(1) Missing data file error
Error message:

Resource u'tokenizers/punkt/english.pickle' not found.  Please use the NLTK Downloader to obtain the resource:  >>> nltk.download()
Searched in:
    - '/var/www/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - u''
Traceback (most recent call last):
  File "/var/www/CSCE-470-Anime-Recommender/py/app.py", line 40, in <module>
    cl = NaiveBayesClassifier(Functions.classify(UserData))
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 192, in __init__
    self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 169, in extract_features
    return self.feature_extractor(text, self.train_set)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 81, in basic_extractor
    word_features = _get_words_from_dataset(train_set)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 63, in _get_words_from_dataset
    return set(all_words)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 62, in <genexpr>
    all_words = chain.from_iterable(tokenize(words) for words, _ in dataset)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 59, in tokenize
    return word_tokenize(words, include_punc=False)
  File "/usr/local/lib/python2.7/dist-packages/textblob/tokenizers.py", line 72, in word_tokenize
    for sentence in sent_tokenize(text))
  File "/usr/local/lib/python2.7/dist-packages/textblob/base.py", line 64, in itokenize
    return (t for t in self.tokenize(text, *args, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/textblob/decorators.py", line 38, in decorated
    raise MissingCorpusError()
MissingCorpusError: Looks like you are missing some required data for this feature.

To download the necessary data, simply run

    python -m textblob.download_corpora

or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.
I didn't have the file taggers/averaged_perceptron_tagger/averaged_perceptron_tagger.pickle locally. Opening my local nltk_data directory confirmed it really was missing, so the only fix was to download it.
Solution: download it with NLTK
nltk.download()
During the download a dialog pops up where you pick the files to fetch yourself: under the Models tab, the first entry is averaged_perceptron_tagger; select it and click Download. With a decent network connection it finishes quickly.
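If you prefer to skip the GUI dialog (for example on a headless server), the same packages can be fetched non-interactively. The sketch below is one way to do it, assuming a working network connection; "punkt" and "averaged_perceptron_tagger" are the standard NLTK package identifiers for the two resources this post mentions, and ensure_nltk_data is a hypothetical helper name.

```python
import nltk

def ensure_nltk_data(packages=("punkt", "averaged_perceptron_tagger")):
    """Download each NLTK package only if it is not already present locally."""
    for pkg in packages:
        # punkt is installed under tokenizers/, the tagger under taggers/
        prefix = "tokenizers" if pkg == "punkt" else "taggers"
        try:
            nltk.data.find("%s/%s" % (prefix, pkg))
        except LookupError:
            nltk.download(pkg)

if __name__ == "__main__":
    ensure_nltk_data()
```

Checking with nltk.data.find first avoids re-downloading data that is already in one of the nltk_data directories listed in the error message above.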
(2) Translation problem
TextBlob's translation code lives in /usr/lib/python2.7/site-packages/textblob/translate.py
It relies on Google Translate; the URL hard-coded in the source is
url = "http://translate.google.com/translate_a/t"
This address is not reachable from mainland China, so translation simply fails.
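Before blaming TextBlob itself, it can help to confirm whether the endpoint is reachable from your network at all. The following is a minimal sketch of such a check, using only the standard library; the URL is the one from translate.py quoted above, and google_translate_reachable is a hypothetical helper name (the timeout value is arbitrary).

```python
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen         # Python 2

def google_translate_reachable(url="http://translate.google.com/translate_a/t",
                               timeout=3):
    """Return True if an HTTP request to the translate endpoint succeeds."""
    try:
        urlopen(url, timeout=timeout)
        return True
    except Exception:
        # Any network error (DNS failure, timeout, connection reset, or an
        # HTTP error status) means translation calls will not work either.
        return False
```

If this returns False, the fix is a network-level one (proxy or VPN), not anything in the TextBlob code.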