Web scraping: downloading wallpapers for practice

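I just started learning web scraping, so I tried it out on a site with really nice images: https://wallhaven.cc/hot

It is a good fit for beginners. Feel free to share your thoughts.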

import os
import time

import requests
from bs4 import BeautifulSoup

# Target page URL
url = "https://wallhaven.cc/hot"

# Make sure the output directory exists before writing into it
os.makedirs("wallpaper", exist_ok=True)

# Fetch the listing page
resp = requests.get(url)
resp.encoding = "utf-8"

# Parse the listing page and collect the preview links
bsobj = BeautifulSoup(resp.text, "html.parser")
imglist = bsobj.find("section", attrs={"class": "thumb-listing-page"}).find_all("a", attrs={"class": "preview"})
for img in imglist:
    # Read the detail-page URL from the href attribute instead of slicing the tag's string
    child_url = img.get("href")
    child_resp = requests.get(child_url)
    child_bsobj = BeautifulSoup(child_resp.text, "html.parser")
    # The full-size image is the <img id="wallpaper"> element on the detail page
    before_src = child_bsobj.find("img", attrs={"id": "wallpaper"})
    src = before_src.get("src")
    src_file = requests.get(src)

    # Use the last URL segment as the file name
    img_name = src.split("/")[-1]
    with open("wallpaper/" + img_name, mode="wb") as f:
        f.write(src_file.content)
    print("finished", img_name)
    child_resp.close()
    time.sleep(1)  # pause between requests to go easy on the server

print("all finish !!!")

resp.close()
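
The script takes the file name from the last segment of the image URL. That works as long as the URL has no query string. Below is a small sketch of a slightly more defensive version using only the standard library. The helper name `local_path_for` and the example URL are made up for illustration and are not part of the original script.

```python
import os
from urllib.parse import urlsplit, unquote


def local_path_for(src_url, out_dir="wallpaper"):
    """Map an image URL to a local file path inside out_dir (hypothetical helper)."""
    # Use only the path component so "?query" and "#fragment" parts are dropped.
    url_path = urlsplit(src_url).path
    # Decode percent-escapes, then keep only the final path segment.
    name = os.path.basename(unquote(url_path))
    if not name:
        raise ValueError("no file name in URL: " + src_url)
    return os.path.join(out_dir, name)


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    print(local_path_for("https://example.com/full/ab/wallhaven-abc123.jpg?download=1"))
```

In the loop above, `open(local_path_for(src), mode="wb")` could stand in for the string concatenation, assuming the `wallpaper` directory already exists.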
