When I try to insert a string into an SQLite database from Python code, I get this error:
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless
you use a text_factory that can interpret 8-bit bytestrings (like
text_factory = str). It is highly recommended that you instead just
switch your application to Unicode strings.
This is the insert statement:
cur.execute("insert into links (url, title, ...) values (:url, :title, ...)", locals())
The strings are obtained as follows:
soup = BeautifulSoup(html.read(), fromEncoding="utf-8")
html.close()
for i in soup.findAll('a'):
    url = i['href']
    title = i.renderContents()
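Can you suggest how I can insert these strings into the SQLite database?
Edit: I found that the URL string is fine when I insert it into another table. The URL string is of type unicode. The problem occurs when inserting the title string, which is of type str.
I tried: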
title = unicode(i.renderContents())
But this ends with the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
44: ordinal not in range(128)
Thanks
Solution:
Although it is not strictly required for the URL, you can store it as Unicode.
BeautifulSoup works with Unicode.
>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup("""<a href="ascii">""", fromEncoding="utf-8")
>>> isinstance(soup('a', href=True)[0]['href'], unicode)
True
>>> soup = BeautifulSoup("""<a href="αβγ">""", fromEncoding="utf-8")
>>> soup('a', href=True)[0]['href']
u'\u03b1\u03b2\u03b3'
In both cases the URL is unicode.
You can call isinstance() or type() to find out what type the URL has.
You can specify encoding=None to get Unicode:
i.renderContents(encoding=None)
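If the title has already been rendered to a byte string, decoding it explicitly is another way to get text. The following is a minimal sketch, not taken from the original post: raw_title is a hypothetical stand-in for the bytes returned by i.renderContents(), to_text is an illustrative helper name, and UTF-8 is an assumption (the 0xc3 byte in the error above is consistent with it). It is written to run on both Python 2.6+ and Python 3.

```python
# Hypothetical stand-in for the bytes returned by i.renderContents():
# the UTF-8 encoding of "cafe" with an acute accent on the final letter.
raw_title = b"caf\xc3\xa9"


def to_text(raw, encoding="utf-8"):
    """Decode a byte string to text; return text values unchanged."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# Decoding with the ASCII codec fails on the 0xc3 byte, which is what
# unicode(raw_title) does implicitly on Python 2.
try:
    raw_title.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Naming the codec explicitly succeeds, assuming the bytes really are UTF-8.
title = to_text(raw_title)

print(ascii_failed)
print(repr(title))
```

The helper leaves values that are already text untouched, so it can be applied to both the URL and the title without checking their types first.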
In general, it can be helpful to use dir(obj) and help(obj.method) in an interactive Python console. See also Printing Document.
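Once both values are text, the insert from the question should go through. The sketch below is an assumption-laden illustration rather than the asker's actual code: it uses an in-memory database, a hypothetical links table with only the url and title columns (the other columns elided in the question are left out), example values, and an explicit parameter dict in place of locals().

```python
import sqlite3

# In-memory database with a hypothetical two-column version of the table.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table links (url text, title text)")

# Example values, decoded from UTF-8 bytes so that both are text objects.
url = b"http://example.com/caf\xc3\xa9".decode("utf-8")
title = b"Caf\xc3\xa9".decode("utf-8")

# Named placeholders are filled from an explicit dict of text values.
cur.execute(
    "insert into links (url, title) values (:url, :title)",
    {"url": url, "title": title},
)
conn.commit()

# Read the row back to confirm the text round-trips unchanged.
row = cur.execute("select url, title from links").fetchone()
print(row == (url, title))

conn.close()
```

Passing an explicit dict keeps the bound parameters limited to the values intended for the statement, which makes it easier to spot one that is still a byte string.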