Error inserting a string into a SQLite database from Python code

When I try to insert a string into a SQLite database from Python code, I get this error:

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless
you use a text_factory that can interpret 8-bit bytestrings (like
text_factory = str). It is highly recommended that you instead just
switch your application to Unicode strings.

This is the insert statement:

cur.execute("insert into links (url, title, ...) values (:url, :title, ...)", locals())

The string is obtained as follows:

soup = BeautifulSoup(html.read(), fromEncoding="utf-8")
html.close()
for i in soup.findAll('a'):
  url = i['href']
  title = i.renderContents()

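Can you suggest how I can insert the string into the SQLite database?

Edit: I found that the url string is fine when inserting into another table. The url string is of type unicode. The problem occurs when inserting the title string, which is of type str.

I tried: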

title = unicode(i.renderContents())

But this ends with the error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
44: ordinal not in range(128)

Thanks

Solution:

Although it is not strictly required for the url, you can store it as Unicode.

BeautifulSoup works with Unicode.

>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup("""<a href="ascii">""", fromEncoding="utf-8")
>>> isinstance(soup('a', href=True)[0]['href'], unicode)
True

>>> soup = BeautifulSoup("""<a href="αβγ">""", fromEncoding="utf-8")
>>> soup('a', href=True)[0]['href']
u'\u03b1\u03b2\u03b3'

In both cases the url is unicode.

You can call isinstance() or type() to find out the type of the url.

You can pass encoding=None to get Unicode back from renderContents():
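The type check can be applied to every value before it reaches the insert statement. Below is a minimal sketch, assuming Python 3 naming, where the byte string type is bytes and the text type is str (Python 2's str and unicode, respectively). The helper find_byte_values and the sample parameter values are hypothetical and only illustrate the idea.

```python
def find_byte_values(params):
    """Return the names of parameters whose values are byte strings, not text."""
    return sorted(name for name, value in params.items()
                  if isinstance(value, bytes))


# Hypothetical parameters: the url is already text, the title is still
# UTF-8 encoded bytes, as in the question.
params = {
    "url": "http://example.com/",
    "title": "Caf\u00e9".encode("utf-8"),
}

byte_names = find_byte_values(params)
print(byte_names)
```

Any name reported here is a value that still needs to be decoded to text before it is inserted.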

i.renderContents(encoding=None)
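If the title is already a UTF-8 byte string, decoding it explicitly gives the same kind of result. The following is a minimal sketch, assuming Python 3, an in-memory database, and a hypothetical two-column links table (the real table has more columns, which the question elides). The sample title is made up; it contains the byte 0xc3, like the one in the UnicodeDecodeError from the question.

```python
import sqlite3

# Hypothetical stand-in for the UTF-8 encoded bytes that renderContents()
# returns when no text result is requested.
raw_title = "Caf\u00e9 menu".encode("utf-8")

# Decode with the known encoding instead of relying on an implicit
# ASCII default, which cannot handle bytes such as 0xc3.
title = raw_title.decode("utf-8")
url = "http://example.com/"

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Assumed schema, for illustration only.
cur.execute("create table links (url text, title text)")
cur.execute(
    "insert into links (url, title) values (:url, :title)",
    {"url": url, "title": title},
)
conn.commit()

stored_title = cur.execute("select title from links").fetchone()[0]
print(stored_title == title)
conn.close()
```

Passing an explicit dictionary instead of locals() keeps the sketch self-contained; the named placeholders work the same way as in the question's statement.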

In general, using dir(obj) and help(obj.method) in an interactive Python console can be helpful. See also Printing Document.
