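When pandas performs vectorized arithmetic on DataFrames, it matches values by their row and column index labels, not by their row and column positions: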
1. Row and column indexes both match:
import pandas as pd
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})
df2 = pd.DataFrame({'a': [10, 20, 30], 'b': [40, 50, 60], 'c': [70, 80, 90]})
print(df1 + df2)
    a   b   c
0  11  44  77
1  22  55  88
2  33  66  99
2. Row indexes match, column indexes differ:
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})
df2 = pd.DataFrame({'d': [10, 20, 30], 'c': [40, 50, 60], 'b': [70, 80, 90]})
print(df1 + df2)
    a   b   c   d
0 NaN  74  47 NaN
1 NaN  85  58 NaN
2 NaN  96  69 NaN
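A label that appears in only one of the two DataFrames has no counterpart to add to, so the result at those positions is NaN.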
3. Row indexes differ, column indexes match:
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]},
                   index=['row1', 'row2', 'row3'])
df2 = pd.DataFrame({'a': [10, 20, 30], 'b': [40, 50, 60], 'c': [70, 80, 90]},
                   index=['row4', 'row3', 'row2'])
print(df1 + df2)
         a     b     c
row1   NaN   NaN   NaN
row2  32.0  65.0  98.0
row3  23.0  56.0  89.0
row4   NaN   NaN   NaN
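In short: where a row label and a column label are present in both DataFrames, the two values are combined; every other position in the result is NaN.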
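The rule can be checked with a small sketch that contrasts label-based addition with purely positional addition. The one-column frames and the variable names (by_label, by_position, filled) are illustrative assumptions, not part of the examples above. The last line uses DataFrame.add with fill_value=0, which is one possible way to avoid NaN when a label exists on only one side.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3]}, index=['row1', 'row2', 'row3'])
df2 = pd.DataFrame({'a': [10, 20, 30]}, index=['row4', 'row3', 'row2'])

# Label-based addition: the result index is the union of both indexes,
# and labels found on only one side yield NaN.
by_label = df1 + df2

# Positional addition for comparison: to_numpy() drops the labels,
# so values are paired purely by their position.
by_position = df1.to_numpy() + df2.to_numpy()

# Assumed alternative: treat a label missing from one side as 0 instead of NaN.
filled = df1.add(df2, fill_value=0)

print(by_label)
print(by_position)
print(filled)
```

Here row2 and row3 are paired by label (2 + 30 and 3 + 20), whereas the positional version pairs the first row with the first row regardless of its label.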