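Today we want to extract the titles and heat values of the entries on the Sina Weibo hot search list.
With what we have learned so far, this problem should not be hard to solve.
I will go straight to the code; the comments in it explain each step: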
    import requests
    from bs4 import BeautifulSoup

    # Define the URL and the request headers
    url = 'https://s.weibo.com/top/summary?Refer=top_hot&topnav=1&wvr=6'
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36"
    }

    # Send the request and decode the response body as UTF-8
    response = requests.get(url, headers=headers)
    content = response.content.decode('utf8')

    # Instantiate the BeautifulSoup object
    soup = BeautifulSoup(content, 'lxml')

    # Extract the data; [1:] skips the first td-02 cell
    sinas = []
    tds = soup.find_all('td', class_="td-02")[1:]
    for td in tds:
        # Title of the hot search entry
        event = td.find_all('a')[0].string
        # Heat value of the entry
        hot = td.find_all('span')[0].string
        sina = {
            'event': event,
            'hot': hot
        }
        sinas.append(sina)

    print(sinas)
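As a side note, the extraction step can be tried without touching the network. The sketch below is only an illustration: `SAMPLE_HTML` is made-up markup that imitates the `td-02` cells, and `parse_hot_search` is a hypothetical helper name, not part of the script above. It assumes that the first cell is skipped with `[1:]` because it has no `span` with a heat value, so here cells missing either tag are skipped explicitly instead. It also uses the built-in `html.parser` backend so that `lxml` is not required.

```python
from bs4 import BeautifulSoup

# Made-up markup imitating the hot search table (an assumption, not real page data)
SAMPLE_HTML = """
<table>
  <tr><td class="td-02"><a href="#">Pinned topic</a></td></tr>
  <tr><td class="td-02"><a href="#">Topic A</a><span>12345</span></td></tr>
  <tr><td class="td-02"><a href="#">Topic B</a><span> 678 </span></td></tr>
</table>
"""


def parse_hot_search(html):
    """Return a list of {'event', 'hot'} dicts from td-02 cells (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for td in soup.find_all("td", class_="td-02"):
        link = td.find("a")
        heat = td.find("span")
        # Skip cells that lack a title link or a heat value
        if link is None or heat is None:
            continue
        rows.append({
            "event": link.get_text(strip=True),
            "hot": heat.get_text(strip=True),
        })
    return rows


rows = parse_hot_search(SAMPLE_HTML)
```

Checking for `None` rather than indexing `find_all(...)[0]` means a cell with an unexpected layout is skipped instead of raising an `IndexError`.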
The `sinas` list that the script finally prints looks like this: