XPath Summary

2022-09-18 20:42:54

For example, take this table row (the table's opening tag is not shown):

<tr id="row_1716002" class="ni2" name="2016-04-12" gamename="欧冠杯" polygoal="-2" matchid="1716002"> hello</tr>

</table>

Selecting elements with selectors

1) Select by attribute value

//*[@id="MatchTable"]

table = response.xpath('//*[@id="MatchTable"]')

2) Select elements by whether they have a given attribute

tr[@matchid]

tr = table.xpath('tr[@matchid]')
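The two expressions above can be tried without Scrapy. The following is a minimal sketch that uses Python's standard-library xml.etree.ElementTree, which supports a limited subset of XPath that happens to cover both predicates. The surrounding div, the second row, and the variable names are assumptions added for illustration; they are not from the original page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment; the second row has no matchid attribute.
HTML = """
<div>
  <table id="MatchTable">
    <tr id="row_1716002" class="ni2" name="2016-04-12" matchid="1716002">hello</tr>
    <tr class="header">not a match row</tr>
  </table>
</div>
"""

root = ET.fromstring(HTML)

# 1) Select by attribute value: any descendant whose id equals "MatchTable".
table = root.find('.//*[@id="MatchTable"]')

# 2) Select by attribute presence: direct tr children that carry a matchid attribute.
rows = table.findall('tr[@matchid]')

match_ids = [row.get('matchid') for row in rows]
print(table.tag, match_ids)
```

Only the row that carries matchid is returned; the header row is skipped even though it is also a tr.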

3) Select content

<a href="http://cms.8win.com/zybl/one-201604236003"> HelloWorld</a>

3.1) Select the text

selector.xpath('text()').extract_first() # ' HelloWorld' (leading space kept; use .strip() to remove it)

3.2) Select the value of the href attribute

selector.xpath('@href').extract_first() # http://cms.8win.com/zybl/one-201604236003
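As a rough cross-check of the two results above, here is a sketch with the standard-library xml.etree.ElementTree. ElementTree does not accept text() or @href as path steps, so the same values are read through the element's .text attribute and .get() method; this is a stand-in for the Scrapy calls, not the same API.

```python
import xml.etree.ElementTree as ET

# The anchor element from the example above, parsed on its own.
a = ET.fromstring(
    '<a href="http://cms.8win.com/zybl/one-201604236003"> HelloWorld</a>'
)

raw_text = a.text            # text node, including the leading space
clean_text = raw_text.strip()
href = a.get('href')         # attribute value

print(repr(raw_text), clean_text, href)
```

The raw text keeps the leading space from the markup, which is why stripping the extracted string is usually worthwhile.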

4) Select by CSS class

<p class="abstract">
飓风主力中场弗里茨勒（9场1球）本轮累计黄牌禁赛。（一路追球）
</p>

<p class="abstract abstract-nophoto">
飓风解放者杯小组成功出线，但联赛最近有所起伏，他们的近3个客场没有胜绩（1平2负），目前排名B组第4，已经被榜首球队拉开了8分差距。（一路追球）
</p>

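Corresponding code. Both expressions match both paragraphs above, so extract_first() returns the text of the first paragraph: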

div.xpath('p[starts-with(@class,"abstract")]/text()').extract_first().strip()

div.xpath('p[contains(@class,"abstract")]/text()').extract_first().strip()
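Note that starts-with() and contains() are plain string tests on the whole class attribute, so they can match more than intended. The sketch below illustrates the difference using the standard-library xml.etree.ElementTree and ordinary Python string operations (ElementTree does not implement these XPath functions, so str.startswith and the in operator stand in for them). The class names "non-abstract" and "abstracts-list" and the helper has_class are hypothetical, added only to show the over-matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the last two classes only resemble "abstract".
HTML = """
<div>
  <p class="abstract">first</p>
  <p class="abstract abstract-nophoto">second</p>
  <p class="non-abstract">third</p>
  <p class="abstracts-list">fourth</p>
</div>
"""

div = ET.fromstring(HTML)
paragraphs = div.findall("p")


def has_class(element, name):
    """Return True if name is one of the whitespace-separated class tokens."""
    return name in (element.get("class") or "").split()


# Mirrors contains(@class, "abstract"): substring anywhere in the attribute.
substring_hits = [p.text for p in paragraphs if "abstract" in (p.get("class") or "")]

# Mirrors starts-with(@class, "abstract"): prefix of the attribute.
prefix_hits = [p.text for p in paragraphs if (p.get("class") or "").startswith("abstract")]

# Token match: "abstract" must be a complete class name.
token_hits = [p.text for p in paragraphs if has_class(p, "abstract")]

print(substring_hits)
print(prefix_hits)
print(token_hits)
```

Only the token match is limited to the first two paragraphs. In Scrapy itself, the CSS form div.css('p.abstract') performs this kind of whole-token class match.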
