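Here's the result first:
Steps:
1. Searched Baidu for pictures I like. There were far too many, so I took it slowly, and persistence finally paid off.
2. Found them:
3. Let's get started: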
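(1) Disguise the request as a browser (I captured the headers with Fiddler and mimic Chrome)
import urllib.request

def hander_request1(url, page, i):
    # Build the page URL from the counter, e.g. <url>10000.html
    url = url + str(i) + '.html'
    # print(url)
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36',
    }
    # page is not used here; it is kept so the calls in main() still match
    request = urllib.request.Request(url, headers=headers)
    return request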
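A quick way to check that the disguise is in place is to inspect the request object before sending it. This is only a minimal sketch: `build_request` and `USER_AGENT` are names made up for illustration, and nothing here touches the network.

```python
import urllib.request

# Same Chrome User-Agent string as in the step above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36"
)


def build_request(base_url: str, index: int) -> urllib.request.Request:
    """Build (but do not send) a request for <base_url><index>.html."""
    url = f"{base_url}{index}.html"
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


req = build_request("http://www.kantuba.net/guonei/", 10000)
print(req.full_url)
# urllib stores header names capitalized as "User-agent", so look it up that way.
print(req.get_header("User-agent"))
```

If the second line printed is the Chrome string, the server will see the request as coming from a browser rather than from `Python-urllib`.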
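(2) Pull the fields out with a regular expression (written in a rather roundabout way here)
import os
import re

# cont is the decoded page HTML passed in from main().
# NOTE: the pattern is empty in this post; put the site-specific regex here.
part = re.compile(r'')
lt = part.findall(cont)
dirname = '美女'
# urllib.request.urlretrieve(str(lt), filepath)
print(lt)
# Stringify the match list and split on double quotes:
# the first quoted value is the image URL, the last one is the title.
url1 = str(lt).split('"')[1]
print(url1)
f1 = str(lt).split('"')[-2]
filename = f1
print(filename + ' download started')
filepath = dirname + '/' + filename + '.jpg'
# Create the target directory on first use
if not os.path.exists(dirname):
    os.mkdir(dirname)
# nt=mt.split()[0]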
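The real pattern did not survive in this post, so here is a hedged sketch of what the extraction could look like. The HTML snippet, the `IMG_PATTERN` regex and the `extract_images` helper are all assumptions for illustration; the actual site markup will differ. Using capture groups gives the URL and title directly, without the `str(lt).split('"')` trick.

```python
import re

# ASSUMPTION: made-up markup standing in for the real page source.
SAMPLE_HTML = """
<div class="content">
  <img src="http://example.com/images/a1.jpg" alt="sample-one">
  <img src="http://example.com/images/b2.jpg" alt="sample-two">
</div>
"""

# ASSUMPTION: hypothetical pattern; group 1 is the image URL, group 2 the title.
IMG_PATTERN = re.compile(r'<img\s+src="([^"]+)"\s+alt="([^"]*)"')


def extract_images(html: str) -> list:
    """Return a list of (url, title) tuples found in the HTML."""
    return IMG_PATTERN.findall(html)


pairs = extract_images(SAMPLE_HTML)
for url, title in pairs:
    print(url, title)
```

With two groups in the pattern, `findall` returns one tuple per match, so each image's URL and name stay paired.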
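(3) Save the file under its path and name
# hd is not defined anywhere in this post; presumably it is the same
# headers dict as in step (1).
request1 = urllib.request.Request(url=url1, headers=hd)
response1 = urllib.request.urlopen(request1)
# urllib.request.urlretrieve(url1, filepath)
# Write to filepath so the image lands inside dirname
with open(filepath, 'wb') as fp:
    fp.write(response1.read())
# print(mt+'下载完成')
print(filename + ' download finished')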
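The saving step can be tried on its own, without downloading anything. The sketch below is just one way to do it: `save_image` is a hypothetical helper, the bytes are fake, and a temporary directory is used so nothing is left in the working folder.

```python
import os
import tempfile


def save_image(data: bytes, dirname: str, filename: str) -> str:
    """Write data to <dirname>/<filename>.jpg and return the full path."""
    # exist_ok=True makes a separate os.path.exists() check unnecessary.
    os.makedirs(dirname, exist_ok=True)
    filepath = os.path.join(dirname, filename + ".jpg")
    with open(filepath, "wb") as fp:
        fp.write(data)
    return filepath


# ASSUMPTION: placeholder bytes instead of a real response body.
fake_bytes = b"\xff\xd8\xff\xe0 not a real jpeg"
demo_dir = os.path.join(tempfile.mkdtemp(), "images_demo")
saved = save_image(fake_bytes, demo_dir, "sample")
print(saved)
```

In the real script, `data` would be `response1.read()` and `dirname` the folder name chosen in step (2).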
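(4) My pictures are grouped by category; you know what a photo set is, right?
So I wrote two loops: one over the sets and one over the pages inside each set.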
def main():
    url = 'http://www.kantuba.net/guonei/'
    start_page = int(input('Enter the start page: '))
    end_page = int(input('Enter the end page: '))
    # i is the page counter for the sets
    i = 0
    page = 0
    # NOTE: hander_request (for pages after the first) and download_image
    # are called below, but their definitions are not shown in this post.
    if start_page == 1:
        for i in range(10000, 10020):
            # The first page of a set has no page suffix
            request = hander_request1(url, page, i)
            cont = urllib.request.urlopen(request).read().decode()
            download_image(cont)
            for page in range(start_page + 1, end_page):
                request = hander_request(url, page, i)
                cont = urllib.request.urlopen(request).read().decode()
                download_image(cont)
                # wenjianming = str(i) + str(page) + '.html'
                # with open(wenjianming, 'wb') as fp:
                #     fp.write(download_image(cont))
                # # time.sleep(1)
                # print(wenjianming + 'OK!')
    else:
        for i in range(10000, 10020):
            for page in range(start_page, end_page):
                request = hander_request(url, page, i)
                cont = urllib.request.urlopen(request).read().decode()
                download_image(cont)
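The two branches above can be folded into a single pair of loops if the URL builder itself decides whether a page is the first one. This is only a sketch: `build_url` and `iter_urls` are names invented here, and the `_<page>.html` suffix for later pages is a guess, since the post does not show how `hander_request` builds its URLs.

```python
from typing import Iterable, Iterator

BASE_URL = "http://www.kantuba.net/guonei/"


def build_url(base_url: str, set_id: int, page: int) -> str:
    """Return the URL for one page of one photo set."""
    if page == 1:
        # First page: <base><id>.html, as in hander_request1.
        return f"{base_url}{set_id}.html"
    # ASSUMPTION: hypothetical naming scheme for later pages.
    return f"{base_url}{set_id}_{page}.html"


def iter_urls(base_url: str, set_ids: Iterable[int],
              start_page: int, end_page: int) -> Iterator[str]:
    """Yield page URLs set by set; end_page is exclusive, as in main()."""
    for set_id in set_ids:
        for page in range(start_page, end_page):
            yield build_url(base_url, set_id, page)


urls = list(iter_urls(BASE_URL, range(10000, 10002), 1, 3))
for u in urls:
    print(u)
```

Each yielded URL would then go through the same request, read and `download_image` steps as before. Note that `range(start_page, end_page)` stops one page short of `end_page`, which matches the loops in `main()`.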
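(5) I tested it myself and it works great, so help yourself. To reuse it, you only need to change the regular expression and the URL. Don't forget to thank me while you're browsing under the covers! You're welcome!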