A Python RSS parser that also handles FeedBurner

2023-07-22 10:06:46

I'm writing a Python parser script for RSS feeds. I'm using feedparser, but I'm stuck on parsing feeds from FeedBurner. Who needs FeedBurner nowadays? Anyway..

For example, I can't find a way to parse these:

http://feeds.wired.com/wired/index

http://feeds2.feedburner.com/ziffdavis/pcmag

When I pass them to the feedparser library, it doesn't seem to work.
I tried appending ?fmt=xml or ?format=xml to the URL, but I still don't get XML back.

Do I need to use an HTML parser such as BeautifulSoup to parse FeedBurner feeds? Ideally, is there a public Python parser or aggregator script that already handles this?

Any tips or help would be greatly appreciated.

Solution:

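You may have a version problem, or you may be using the API incorrectly; it would help to see your error message. For example, the following works with Python 2.7 and feedparser 5.0.1: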

>>> import feedparser
>>> url = 'http://feeds2.feedburner.com/ziffdavis/pcmag'
>>> d = feedparser.parse(url)
>>> d.feed.title
u'PCMag.com: New Product Reviews'
>>> d.feed.link
u'http://www.pcmag.com'
>>> d.feed.subtitle
u"First Look At New Products From PCMag.com including Lab Tests, Ratings, Editor's and User's Reviews."
>>> len(d['entries'])
30
>>> d['entries'][0]['title']
u'Canon Color imageClass MF9280cdn'

And with the other URL:

>>> url = 'http://feeds.wired.com/wired/index'
>>> d = feedparser.parse(url)
>>> d.feed.title
u'Wired Top Stories'
>>> d.feed.link
u'http://www.wired.com/rss/index.xml'
>>> d.feed.subtitle
u'Top Stories<img src="http://www.wired.com/rss_views/index.gif" />'
>>> len(d['entries'])
30
>>> d['entries'][0]['title']
u'Heart of Dorkness: LARPing Goes Haywire in <em>Wild Hunt</em>'
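
If a feed still fails for you, it helps to look at what feedparser recorded about the attempt before reaching for an HTML parser. Here is a minimal sketch, assuming the result exposes feedparser's usual `bozo`, `bozo_exception`, `status` and `version` fields (`status` is only present when the feed was fetched over HTTP). The helper names `describe_parse_result` and `check_feed` are mine, not part of feedparser:

```python
def describe_parse_result(parsed):
    """Summarise a feedparser result so that parse problems are visible.

    `parsed` is whatever feedparser.parse() returned. Fields that may be
    absent (e.g. `status` for a non-HTTP source) are read with a default.
    """
    feed = getattr(parsed, "feed", {})
    entries = getattr(parsed, "entries", [])
    report = {
        "status": getattr(parsed, "status", None),   # HTTP status, if any
        "version": getattr(parsed, "version", ""),   # e.g. "rss20", "atom10"
        "bozo": bool(getattr(parsed, "bozo", 0)),    # True if feed was not well-formed
        "error": None,
        "title": feed.get("title"),
        "entry_count": len(entries),
    }
    if report["bozo"]:
        # bozo_exception holds the exception raised while parsing
        report["error"] = repr(getattr(parsed, "bozo_exception", None))
    return report


def check_feed(url):
    """Fetch and parse `url` with feedparser, then summarise the result."""
    # Imported lazily so describe_parse_result works without the library.
    import feedparser  # third-party: pip install feedparser
    return describe_parse_result(feedparser.parse(url))
```

Calling `check_feed('http://feeds2.feedburner.com/ziffdavis/pcmag')` and posting the resulting dictionary, in particular `status`, `bozo` and `error`, should show whether the problem is the download or the parsing.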
