ValueError: You are trying to merge on object and int64 columns

The error:

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

Original code:

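import pandas as pd

# Pass the separator by keyword; recent pandas versions reject it as a positional argument
pca = pd.read_csv("I:/1000GenomeProject/result/plink.eigenvec", sep=" ", header=None)
ped = pd.read_csv("I:/1000GenomeProject/result/20130606_g1k.ped", sep="\\t", engine='python')
# Rename the columns of the PCA file: column 1 becomes "Individual ID", columns 2-21 become PC1-PC20
pca = pca.rename(columns=dict([(1, "Individual ID")] + [(x, "PC" + str(x - 1)) for x in range(2, 22)]))

# Join the two tables
# This version raises the error: join(on=...) matches pca["Individual ID"] against the integer index of ped
pcaped = pca.join(ped, on="Individual ID", how="inner")

# This version works: both tables are indexed by "Individual ID" before joining
pcaped = pca.set_index('Individual ID').join(ped.set_index('Individual ID'), how="inner")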

Solution:

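This error can occur in two scenarios:

When using the join method: join(other, on='key') matches the 'key' column of the calling DataFrame against the index of the other DataFrame, not against its 'key' column. If the other DataFrame still has its default integer index, a string column is compared with int64 values.
When using the merge method: the two columns you are merging on do not have the same dtype, for example strings (object) on one side and integers (int64) on the other.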
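For the merge scenario, the following is a minimal sketch rather than a definitive fix. It assumes the key values really are integers that were read as strings on one side. The DataFrames left and right and their column names are made-up toy data, not taken from the original files.

```python
import pandas as pd

# Hypothetical toy data: the same key is stored as strings on the left
# and as integers on the right.
left = pd.DataFrame({"key": ["1", "2", "3"], "x": [10, 20, 30]})
right = pd.DataFrame({"key": [1, 2, 4], "y": ["a", "b", "d"]})

# Merging on keys of different types raises the ValueError.
merge_error = None
try:
    pd.merge(left, right, on="key")
except ValueError as err:
    merge_error = str(err)

# Cast one side so that both key columns share a dtype, then merge.
left["key"] = left["key"].astype("int64")
merged = pd.merge(left, right, on="key", how="inner")
print(merged)
```

Which side to cast depends on the data. If the keys are identifiers such as sample names, casting the integer side to strings with astype(str) is usually the safer choice.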

Joining on a column rather than on the index with the join method
This example generates the error, because the 'key' column of data_x is compared with the index of data_y:

data_x.join(data_y, on='key')

In the first scenario, change the code so that the join happens on the index: set the key column as the index of both DataFrames before joining.

data_x.set_index('key').join(data_y.set_index('key'))
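The index-based join above can be tried on a small self-contained example. This is only a sketch: the contents of data_x and data_y (sample IDs S1 to S4, the PC1 and Population columns) are invented toy values, chosen to mimic a string ID column joined against a table with a default integer index.

```python
import pandas as pd

# Hypothetical toy data with a string key column in both frames.
data_x = pd.DataFrame({"key": ["S1", "S2", "S3"], "PC1": [0.01, 0.02, 0.03]})
data_y = pd.DataFrame({"key": ["S1", "S2", "S4"], "Population": ["A", "B", "C"]})

# join(on="key") compares data_x["key"] (strings) with the default
# integer index of data_y, so it fails with a ValueError.
join_error = None
try:
    data_x.join(data_y, on="key")
except ValueError as err:
    join_error = str(err)

# Fix: make "key" the index on both sides, then join index to index.
joined = data_x.set_index("key").join(data_y.set_index("key"), how="inner")

# Alternative: merge matches column to column, so no index is needed.
merged = data_x.merge(data_y, on="key", how="inner")

print(joined)
print(merged)
```

The two results hold the same rows. The join version keeps the key as the index, while merge keeps it as an ordinary column.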

Reference:
https://www.roelpeters.be/pandas-solve-you-are-trying-to-merge-on-object-and-int64-columns/
