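Preface
When scraping web pages and parsing them with HtmlAgilityPack, you need XPath to locate elements on the page. A tool that can generate and verify XPath expressions saves a great deal of effort, and the Firefox add-on FirePath, used together with Firebug, does exactly that.
FirePath describes itself as follows: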
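To make "locating page elements with XPath" concrete, here is a minimal sketch in Python. It uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, as a stand-in for HtmlAgilityPack. The markup, the `id` and `class` values, and the `select_titles` helper are all hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a fetched page.
PAGE = """
<html>
  <body>
    <div id="post_list">
      <div class="post_item"><a class="titlelnk" href="/p/1.html">First post</a></div>
      <div class="post_item"><a class="titlelnk" href="/p/2.html">Second post</a></div>
    </div>
  </body>
</html>
"""


def select_titles(markup: str, xpath: str) -> list[tuple[str, str]]:
    """Parse the markup and return (link text, href) for every match of xpath."""
    root = ET.fromstring(markup)
    return [(a.text, a.get("href")) for a in root.findall(xpath)]


# The expression is the kind of thing an XPath tool would help you write and check.
titles = select_titles(PAGE, ".//div[@id='post_list']/div[@class='post_item']/a")
print(titles)
```

Real pages are rarely well-formed XML, which is why an HTML-tolerant parser such as HtmlAgilityPack is used in practice. The XPath expression itself is the part that carries over.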
FirePath is a Firebug extension that adds a development tool to edit, inspect and generate XPath 1.0 expressions, CSS 3 selectors and JQuery selectors.
With FirePath you can:
* Edit XPath expressions, CSS3 selectors and JQuery selectors (Sizzle selector engine) with auto completion for XPath (using TAB or up and down arrows).
* Evaluate the expression/selector on any HTML or XML documents.
* Display the result of evaluations in a Firebug-like DOM tree.
* Highlight the results directly on the document displayed by Firefox (works only with HTML documents).
* Generate an XPath expression or a CSS selector for an element by right clicking on it and selecting "Inspect in FirePath" in the context menu.
* Define the evaluation context (parent) of an expression/selector.
* Choose the document in which to evaluate the expression/selector (only applicable for HTML documents with frames or iframes).
FirePath 0.9.7 requires Firefox 3.5 to 6.0 and Firebug 1.4 to 1.8.*
Note that the XPath auto completion does not work with Firebug 1.6 and higher.
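The "evaluation context (parent)" item in the list above can be sketched as follows: the same expression returns different results depending on the node it is evaluated against. This is only an illustration using Python's `xml.etree.ElementTree` (a limited XPath subset), not FirePath itself. The document and its `id` values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two containers that both hold links.
DOC = ET.fromstring(
    "<body>"
    "<div id='main'><a href='/a'>A</a><a href='/b'>B</a></div>"
    "<div id='side'><a href='/c'>C</a></div>"
    "</body>"
)

# Context = whole document: every link matches.
all_links = [a.get("href") for a in DOC.findall(".//a")]

# Context = one parent node: the same expression now matches only its descendants.
side = DOC.find("./div[@id='side']")
side_links = [a.get("href") for a in side.findall(".//a")]

print(all_links)
print(side_links)
```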
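Hands-on
After installing the Firebug and FirePath add-ons, open the cnblogs (博客园) home page and launch Firebug, as shown below:
It is truly WYSIWYG. The page now looks like this: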