人人网登录地址:http://www.renren.com/
此处登录没有考虑验证码验证码。
首先对登录方法进行分析
有两种方法。
一)在Elements中分析源码
发现登录点击后的事件是http://www.renren.com/PLogin.do
二)在Network中分析网络请求
请求链接:http://www.renren.com/ajaxLogin/login?1=1&uniqueTimestamp=2017110237292
表单数据 :
email 账号用户名
icode 验证码,可为空
origURL : http://www.renren.com/home
domain:renren.com
key_id:1
captcha_type:web_login
password: 密码,需要对输入的密码进行加密处理
rkey: 密码处理
f: 未知
此处采取直接使用Elements发现的触发事件。
1 package 人人网模拟登录;
2
3 import org.apache.http.Header;
4 import org.apache.http.NameValuePair;
5 import org.apache.http.client.ResponseHandler;
6 import org.apache.http.client.entity.UrlEncodedFormEntity;
7 import org.apache.http.client.methods.CloseableHttpResponse;
8 import org.apache.http.client.methods.HttpGet;
9 import org.apache.http.client.methods.HttpPost;
10 import org.apache.http.impl.client.BasicResponseHandler;
11 import org.apache.http.impl.client.CloseableHttpClient;
12 import org.apache.http.impl.client.HttpClients;
13 import org.apache.http.message.BasicNameValuePair;
14 import java.util.ArrayList;
15 import java.util.List;
16
17 public class Renren {
18 public static void main(String[] args) throws Exception{
19 CloseableHttpClient closeableHttpClient = HttpClients.createDefault() ;
20 HttpPost httpPost = new HttpPost("http://www.renren.com/PLogin.do") ;
21
22 String userName = " " ; // 账号写入
23 String passWord = " " ; // 密码写入
24 List<NameValuePair> dlbd = new ArrayList<NameValuePair>();
25 // 登录表单设置
26 dlbd.add(new BasicNameValuePair("domain", "renren.com"));
27 dlbd.add(new BasicNameValuePair("isplogin", "true"));
28 dlbd.add(new BasicNameValuePair("submit", "登录"));
29 dlbd.add(new BasicNameValuePair("email", userName));
30 dlbd.add(new BasicNameValuePair("password", passWord));
31 httpPost.setEntity(new UrlEncodedFormEntity(dlbd));
32 // Post请求
33 CloseableHttpResponse closeableHttpResponse = closeableHttpClient.execute(httpPost) ;
34 // 获取响应头
35 Header locationHeader = closeableHttpResponse.getFirstHeader("Location");
36 // Get请求
37 String header = locationHeader.getValue();
38 HttpGet httpGet = new HttpGet(header) ;
39 ResponseHandler<String> responseHandler = new BasicResponseHandler();
40 String responseBody = closeableHttpClient.execute(httpGet, responseHandler);
41 System.out.println(responseBody);
42 }
43 }
登录成功
如果之前在网页登录失败次数过多,可能会导致爬虫模拟登录需要验证码,而此处是考虑不需要验证码的情况,所以可能会登录失败,解决方法可以是清理本机Cookie。