Installation steps:
https://github.com/lengziyu/vue-seo-phantomjs
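For convenience, the server is not started with pm2, which vue-seo-phantomjs uses by default. Instead it runs as a background process with: nohup node server.js &
Because the Vue site serves many different URLs and query parameters, the nginx rule that forwards search engine crawlers to the proxy needs the following configuration, so that pages with query parameters can be crawled as well: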
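location / {
    # Case-insensitive (~*) match on the crawler's User-Agent header
    if ($http_user_agent ~* "Baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator|bingbot|Sosospider|Sogou Pic Spider|Googlebot|360Spider") {
        # Forward the path plus the query string; $is_args is "?" only when a query string exists
        proxy_pass http://127.0.0.1:8081$uri$is_args$query_string;
    }
}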
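The routing decision made by this rule can be sketched in Python to make it easier to reason about which requests are forwarded and what URL the proxy receives. This is only an illustration of the logic, not part of vue-seo-phantomjs or nginx; the names is_crawler and upstream_url are hypothetical, and the sketch assumes the rendering server listens on 127.0.0.1:8081 as in the config above.

```python
import re

# Same alternation as the nginx rule; re.IGNORECASE mirrors the ~* operator.
CRAWLER_PATTERN = re.compile(
    r"Baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|"
    r"quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|"
    r"W3C_Validator|bingbot|Sosospider|Sogou Pic Spider|Googlebot|360Spider",
    re.IGNORECASE,
)

# Address of the rendering server, taken from the proxy_pass line.
UPSTREAM = "http://127.0.0.1:8081"


def is_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent would match the nginx `if` condition."""
    return CRAWLER_PATTERN.search(user_agent) is not None


def upstream_url(uri: str, query_string: str = "") -> str:
    """Build the URL that proxy_pass would request for a given path and query."""
    # $is_args expands to "?" when a query string is present, otherwise to "".
    is_args = "?" if query_string else ""
    return f"{UPSTREAM}{uri}{is_args}{query_string}"
```

For example, upstream_url("/", "min_id=999") gives http://127.0.0.1:8081/?min_id=999, which shows why $is_args$query_string is needed: without it the query parameters would be dropped before the request reaches the rendering server.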
Crawl simulation tool: https://www.dute.org/crawler
Test URLs for the simulated crawl:
https://www.mxreality.cn/contact
https://www.mxreality.cn/?min_id=999
The results show that the dynamically rendered page data is now being crawled.
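As an alternative to the web tool, the same check can be scripted by requesting a page with a crawler User-Agent and inspecting the returned HTML. The following is a minimal sketch using only the Python standard library; the User-Agent string and the helper names build_crawler_request and fetch_as_crawler are assumptions made for illustration, and any User-Agent that matches the nginx pattern should behave the same way.

```python
import urllib.request

# Assumed crawler User-Agent for testing; it only needs to match the nginx pattern.
BAIDU_UA = (
    "Mozilla/5.0 (compatible; Baiduspider/2.0; "
    "+http://www.baidu.com/search/spider.html)"
)


def build_crawler_request(url: str, user_agent: str = BAIDU_UA) -> urllib.request.Request:
    """Create a GET request that identifies itself as a search engine crawler."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as_crawler(url: str, timeout: float = 10.0) -> str:
    """Fetch the page as a crawler and return the response body as text."""
    with urllib.request.urlopen(build_crawler_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Usage (requires network access):
#   html = fetch_as_crawler("https://www.mxreality.cn/?min_id=999")
#   print(len(html))
```

If the forwarding works, the HTML returned for a crawler User-Agent should contain the rendered page content rather than only the empty Vue mount point that a plain request without JavaScript execution would see.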