Preface:
I have used proxies in Python in quite a few different ways; this post summarizes them.
1. urllib + SOCKS5 proxy, method 1
from sockshandler import SocksiPyHandler
import socks
from urllib.request import build_opener

headers = {
    'Accept': 'text/html, application/xhtml+xml, image/jxr, */*',
    # urllib does not decompress responses itself, so gzip is left disabled here
    # 'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-Hans-CN, zh-Hans; q=0.5',
    # 'Connection': 'Keep-Alive',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

url = 'https://www.google.com/'

# '***' and 0000 are placeholders for the real proxy host, port and credentials
proxy_handler = SocksiPyHandler(proxytype=socks.SOCKS5, proxyaddr='***', proxyport=0000, username='***', password='***')
opener = build_opener(proxy_handler)
opener.addheaders = [(k, v) for k, v in headers.items()]
resp = opener.open(url, timeout=30)
resp_html = resp.read()
print(resp_html.decode())
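One detail worth spelling out: urllib does not decompress response bodies on its own, so if a request does advertise gzip or deflate through Accept-Encoding, the bytes returned by resp.read() may need to be decompressed before decode(). The following is a minimal sketch of that step, not something the code above requires; the helper name decode_body and its parameters are hypothetical and are not part of urllib or PySocks.

```python
import gzip
import zlib


def decode_body(raw, content_encoding=None, charset="utf-8"):
    """Decompress raw response bytes according to Content-Encoding, then decode.

    Hypothetical helper for illustration; urllib itself provides nothing like it.
    """
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        raw = gzip.decompress(raw)
    elif encoding == "deflate":
        try:
            raw = zlib.decompress(raw)  # zlib-wrapped deflate
        except zlib.error:
            raw = zlib.decompress(raw, -zlib.MAX_WBITS)  # raw deflate stream
    # Bodies with no (or an unknown) Content-Encoding are decoded as they are
    return raw.decode(charset, errors="replace")
```

With an opener like the one above, this could be called as decode_body(resp.read(), resp.headers.get("Content-Encoding")).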
2. If a * proxy is running on the local machine and listening on port 1080, the code can be changed to:
proxy_handler = SocksiPyHandler(proxytype=socks.HTTP, proxyaddr='127.0.0.1', proxyport=1080)
3. urllib + SOCKS5 proxy, method 2
from urllib.request import ProxyHandler, build_opener

headers = {
    'Accept': 'text/html, application/xhtml+xml, image/jxr, */*',
    # urllib does not decompress responses itself, so gzip is left disabled here
    # 'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-Hans-CN, zh-Hans; q=0.5',
    # 'Connection': 'Keep-Alive',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

# user, pwd, ip and port are placeholders
proxies = {
    "socks5": "socks5://user:pwd@ip:port"
}

url = 'https://www.baidu.com/'

proxy_handler = ProxyHandler(proxies)
opener = build_opener(proxy_handler)
opener.addheaders = [(k, v) for k, v in headers.items()]
resp = opener.open(url, timeout=30)
resp_html = resp.read()
print(resp_html.decode())
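The code above runs, but it feels somewhat slower. One caveat: ProxyHandler looks up its mapping by the scheme of the requested URL, so a "socks5" key is not consulted for an https:// URL, and the request may in fact be going out without the proxy. In my tests, a local HTTP proxy can be written as: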
proxies = {
"http": "http://127.0.0.1:1080/",
"https": "https://127.0.0.1:1080/"
}
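How urllib interprets the keys of the proxies mapping can be checked without any network access. The sketch below only inspects the handler object: ProxyHandler registers one <scheme>_open method per key, and each is used only for request URLs with that same scheme. The proxy addresses are placeholders, and handled_schemes is a hypothetical helper name introduced for this illustration.

```python
from urllib.request import ProxyHandler


def handled_schemes(proxies):
    """Return the URL schemes a ProxyHandler built from `proxies` will intercept."""
    handler = ProxyHandler(proxies)
    # ProxyHandler sets one "<scheme>_open" attribute on the instance per key
    return sorted(
        name[: -len("_open")]
        for name in vars(handler)
        if name.endswith("_open")
    )


# Keys name the scheme of the URL being requested, not the proxy protocol
print(handled_schemes({"http": "http://127.0.0.1:1080/",
                       "https": "https://127.0.0.1:1080/"}))  # ['http', 'https']
print(handled_schemes({"socks5": "socks5://user:pwd@ip:port"}))  # ['socks5']
```

In the second case there is no entry for https, so as far as this handler is concerned an https:// request is not proxied; the sockshandler approach from method 1 is the one that actually speaks SOCKS.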
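4. Setting a global proxy with socks
import socks
import socket

# '***' and 000 are placeholders for the real proxy host, port and credentials
socks.set_default_proxy(proxy_type=socks.SOCKS5, addr="***", port=000, rdns=True, username='***', password='***')
# Every socket created after this line goes through the proxy
socket.socket = socks.socksocket
Installing socks:
pip install PySocks
# installs two modules: socks and sockshandler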
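Because assigning to socket.socket affects every connection made afterwards in the process, it can be useful to limit the patch to one block of code and restore the original class when the block ends. The following is a minimal sketch of that idea using only the standard library; patched_socket is a hypothetical helper, and in the setup above the class passed in would be socks.socksocket, after socks.set_default_proxy has been called.

```python
import contextlib
import socket


@contextlib.contextmanager
def patched_socket(socket_cls):
    """Temporarily replace socket.socket with `socket_cls`, restoring it on exit."""
    original = socket.socket
    socket.socket = socket_cls
    try:
        yield
    finally:
        # Restore even if the body raised, so later code is not proxied by accident
        socket.socket = original
```

Usage would look like "with patched_socket(socks.socksocket): ..." around the requests that should go through the proxy. Code that did "from socket import socket" before the patch keeps its own reference to the original class and is not affected.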
5. Setting a proxy with requests
import requests as s

headers = {
    'Accept': 'text/html, application/xhtml+xml, image/jxr, */*',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-Hans-CN, zh-Hans; q=0.5',
    'Connection': 'Keep-Alive',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

# SOCKS proxy URLs need the optional SOCKS support: pip install requests[socks]
proxies = {
    "http": "socks5://user:pwd@ip:port/",
    "https": "socks5://user:pwd@ip:port/"
}

# proxies = {
#     "socks5": "socks5://user:pwd@ip:port"
# }

# proxies = {
#     "http": "http://127.0.0.1:1080/",
#     "https": "https://127.0.0.1:1080/"
# }

# url = 'https://www.google.com/'
# url = 'https://search.yahoo.com/'
url = 'https://www.baidu.com/'

resp = s.get(url=url, proxies=proxies, headers=headers)
print(resp.text)
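Writing the proxy URL by hand gets fragile once the username or password contains characters such as '@' or ':'. The following is a minimal sketch of building the same kind of proxies mapping from separate parts, with the credentials percent-encoded; build_proxies and its parameters are hypothetical names for this illustration, and the host, port and credentials shown are placeholders.

```python
from urllib.parse import quote


def build_proxies(host, port, user=None, password=None, scheme="socks5"):
    """Build a requests-style proxies dict covering both http and https traffic.

    Hypothetical helper; credentials are percent-encoded so that characters
    such as '@' or ':' do not break the proxy URL.
    """
    auth = ""
    if user is not None:
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    proxy_url = f"{scheme}://{auth}{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


print(build_proxies("127.0.0.1", 1080, "user", "p@ss:word"))
```

The result can be passed as the proxies argument of s.get. According to the requests documentation, using the scheme socks5h instead of socks5 makes the proxy resolve hostnames rather than the local machine, which is the counterpart of rdns=True in the PySocks example.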