Python实现模拟登陆

大家经常会用Python进行数据挖掘的说,但是有些网站是需要登陆才能看到内容的,那怎么用Python实现模拟登陆呢?其实网路上关于这方面的描述很多,不过前些日子遇到了一个需要cookie才能登陆的网站,而且这个网站还有些问题,于是费了好大的劲才搞定,现在贴出来给大家分享下。

首先是用Python3标准库里的urllib包实现的一个版本,不需要考虑许多细节:

 #! /usr/bin/env python
# -*- coding:utf-8 -*- import urllib.request
import urllib.parse
import http.cookiejar StudentInfoURL = 'http://210.x.x.1:90/student/index.jsp'
loginURL = 'http://210.x.x.1:90/login.jsp'
loginCheckURL = 'http://210.x.x.1:90/j_security_check'
post_data = urllib.parse.urlencode({'j_username': 'xxxxxxx', 'j_password': 'xxxxxxx'})
headers = {
'Content-Type': 'application/x-www-form-urlencoded',
'UserAgent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36'
} cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
#此处一定要链接一次,否则得不到cookie
opener.open(loginCheckURL)
urllib.request.install_opener(opener) ######################此处加入异常处理,再登一次即可######################
request = urllib.request.Request(loginCheckURL, post_data, headers)
try:
response = urllib.request.urlopen(request)
except:
response = urllib.request.urlopen(request)
print(response.read().decode('GBK')) ######################可以开始正常访问啦######################
request = urllib.request.Request(StudentInfoURL, headers=headers)
fp = urllib.request.urlopen(request)
print(fp.read().decode('GBK'))

下面是另一个版本,用的是比较底层的http包里的client模块实现的,个人很喜欢这个版本:

 #!/usr/bin/env python
# -*- coding:utf-8 -*- import http.client ###########################################################
HOST = '210.x.x.1:90'
UserName = "xxxxxxx"
PassWord = "xxxxxxx"
data = "j_username=%s&j_password=%s" %(UserName,PassWord)
Headers = {
"Content-Type":"application/x-www-form-urlencoded",
"User-Agent":"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729)",
}
########################################################### #连接服务器
conn = http.client.HTTPConnection(HOST,timeout=30)
conn.connect() #GET到登录页,以获取cookies
conn.request("GET","/j_security_check",None,Headers)
res = conn.getresponse()
m_cookie = res.getheader("Set-Cookie").split(';')[0]
res.read() #POST到登录页,进行登录
Headers["Cookie"] = m_cookie
conn.request("POST","/j_security_check",data,Headers)
res = conn.getresponse()
res.read()
if res.status == 400:
#再次链接到登录页
conn.request("POST","/j_security_check",data,Headers)
res = conn.getresponse()
res.read()
conn.close() ######################可以开始正常访问啦######################
conn2 = http.client.HTTPConnection(HOST)
conn2.request("GET","/student/index.jsp",None,Headers)
fp = conn2.getresponse()
print(fp.status)
print(fp.read().decode("GBK"))
###########################################################

欢迎大家批评

上一篇:iOS可视化动态绘制连通图(Swift版)


下一篇:绘制图形与3D增强技巧(一)----点图元