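I'm trying to show the correlation between two separate lists. Before installing Numpy, I parsed World Bank data for GDP values and the number of internet users and stored them in two separate lists. Here is the code snippet. It only covers gdp07; in reality I have more lists for more years, plus other data such as unemployment rates.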
import numpy as np

file = open('final_gdpnum.txt', 'r')
gdp07 = []
for line in file:
    fields = line.strip().split()
    gdp07.append(fields[0])

file2 = open('internetnum.txt', 'r')
netnum07 = []
for line in file2:
    fields2 = line.strip().split()
    netnum07.append(fields2[0])

print np.correlate(gdp07, netnum07, "full")
The error I get is this:
Traceback (most recent call last):
  File "Project3.py", line 83, in <module>
    print np.correlate(gdp07, netnum07, "full")
  File "/usr/lib/python2.6/site-packages/numpy/core/numeric.py", line 645, in correlate
    return multiarray.correlate2(a,v,mode)
ValueError: data type must provide an itemsize
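For the record, I'm using Cygwin with Python 2.6 on a Windows machine. I'm only using Numpy, its dependencies, and the other pieces needed to build it (the gcc compiler). Any help would be great. Thanks.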
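Answer:
You are passing the data in as strings, which is probably what causes the error: according to the Python docs, strip() returns a string, and so does every item produced by split().
http://docs.python.org/library/stdtypes.html
Try parsing the data into whatever numeric type you need.
As you can see, passing strings reproduces the error: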
In [14]: np.correlate(["3", "2","1"], [0, 1, 0.5])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/dog/<ipython-input-14-a0b588b9af44> in <module>()
----> 1 np.correlate(["3", "2","1"], [0, 1, 0.5])

/usr/lib64/python2.7/site-packages/numpy/core/numeric.pyc in correlate(a, v, mode, old_behavior)
    643         return multiarray.correlate(a,v,mode)
    644     else:
--> 645         return multiarray.correlate2(a,v,mode)
    646
    647 def convolve(a,v,mode='full'):

ValueError: data type must provide an itemsize
Now try it with the values parsed:
In [15]: np.correlate([int("3"), int("2"),int("1")], [0, 1, 0.5])
Out[15]: array([ 2.5])
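For reference, here is a minimal sketch of what np.correlate returns for the same numbers once they are numeric. With the default mode ("valid") and inputs of equal length, the result is a single dot product; "full" returns one value per lag. If what you want is a correlation coefficient between -1 and 1, np.corrcoef may be closer to it; that is an assumption about your goal, not something the question states.

```python
import numpy as np

a = [3, 2, 1]
v = [0, 1, 0.5]

# Default mode is "valid": with equal-length inputs this is one dot product,
# 3*0 + 2*1 + 1*0.5 = 2.5
valid = np.correlate(a, v)

# "full" slides v across a and returns one value per lag (5 values here)
full = np.correlate(a, v, "full")

# Pearson correlation coefficient: off-diagonal entry of the 2x2 matrix
r = np.corrcoef(a, v)[0, 1]

print(valid)
print(full)
print(r)
```

The middle element of the "full" result is the same 2.5 that the default mode returns.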
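Applied to your code:

import numpy as np

# Read the first column of each file and convert every value to int,
# so that np.correlate receives numbers rather than strings.
gdp07 = []
with open('final_gdpnum.txt', 'r') as file:
    for line in file:
        fields = line.strip().split()
        gdp07.append(int(fields[0]))

netnum07 = []
with open('internetnum.txt', 'r') as file2:
    for line in file2:
        fields2 = line.strip().split()
        netnum07.append(int(fields2[0]))

print np.correlate(gdp07, netnum07, "full")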
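Since you have many such lists, the read-and-convert loop could be factored into a helper. The sketch below is only an illustration: the name read_first_column is hypothetical, it converts with float() instead of int() on the assumption that some values may have decimals, and it skips blank lines, which would otherwise raise an IndexError on fields[0]. The demo file contents are made up.

```python
import os
import tempfile


def read_first_column(path):
    """Return the first whitespace-separated field of each non-blank line as a float."""
    values = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if not fields:  # skip blank lines
                continue
            values.append(float(fields[0]))
    return values


# Demo with a throwaway file; the numbers are invented for illustration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1500.5 extra\n\n2300\n")
    demo_path = tmp.name

demo_values = read_first_column(demo_path)
os.remove(demo_path)
print(demo_values)  # [1500.5, 2300.0]
```

With a helper like this, each list becomes one call, for example gdp07 = read_first_column('final_gdpnum.txt').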
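Your other error is a character-encoding problem.
I hope this works; I don't think I can reproduce it, because my Linux box supports UTF-8 by default.
I went by the IPython help(codecs) documentation and this page: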
http://code.google.com/edu/languages/google-python-class/dict-files.html
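import codecs

gdp07 = []
# 'utf-8-sig' decodes UTF-8 and drops a leading byte-order mark if present
f = codecs.open('final_gdpnum.txt', 'r', 'utf-8-sig')
for line in f:
    fields = line.strip().split()
    gdp07.append(int(fields[0]))
f.close()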
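To see why the encoding can matter here, the following sketch assumes the data file starts with a UTF-8 byte-order mark (BOM); whether yours does is a guess, since the question does not show that error. It writes a small made-up file with a BOM, shows that int() rejects the first field when the file is decoded as plain UTF-8, and then reads it cleanly with the "utf-8-sig" codec.

```python
import io
import os
import tempfile

# Write a small file that starts with a UTF-8 byte-order mark (made-up data).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbf42 foo\n7 bar\n")
    bom_path = tmp.name

# Decoded as plain UTF-8, the BOM survives as U+FEFF in front of the first field.
with io.open(bom_path, "r", encoding="utf-8") as handle:
    first_field = handle.readline().split()[0]
try:
    int(first_field)
    plain_utf8_ok = True
except ValueError:
    plain_utf8_ok = False

# "utf-8-sig" drops the BOM while decoding, so every field parses.
with io.open(bom_path, "r", encoding="utf-8-sig") as handle:
    values = [int(line.split()[0]) for line in handle if line.strip()]

os.remove(bom_path)
print(plain_utf8_ok)
print(values)
```

If your files have no BOM, "utf-8-sig" behaves like plain UTF-8, so using it is harmless either way.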