I'm new to Python, and also new to SO.
I have a pandas DataFrame named df that looks like this:
                            Text
Date       Location
2015-07-08 San Diego, CA       1
2015-07-07 Bellevue, WA        1
           Los Angeles, CA     1
           New York, NY        1
           Los Angeles, CA     1
           Unknown             1
I want to pivot it with the following:
import pandas, numpy as np
df_pivoted = df.pivot_table(df, values=['Text'], index=['Date'],
columns=['Location'],aggfunc=np.sum)
The idea is to generate a heat map showing the count of "Text" by "Location" and "Date".
I get the error:
TypeError: pivot_table() got multiple values for keyword argument 'values'
When I use a simplified call:
df = df.pivot_table('Date', 'Location', 'Text')
I get the error:
raise DataError('No numeric types to aggregate')
I'm using Python 2.7 and Pandas 0.16.2.
In[2]: df.dtypes
Out[2]:
Date        datetime64[ns]
Text                object
Location            object
dtype: object
Does anyone have an idea?
Solution:
import pandas as pd
import numpy as np
# just try to replicate your dataframe
# ==============================================
date = ['2015-07-08', '2015-07-07', '2015-07-07', '2015-07-07', '2015-07-07', '2015-07-07']
location = ['San Diego, CA', 'Bellevue, WA', 'Los Angeles, CA', 'New York, NY', 'Los Angeles, CA', 'Unknown']
text = [1] * 6
df = pd.DataFrame({'Date': date, 'Location': location, 'Text': text})
Out[141]:
         Date         Location  Text
0  2015-07-08    San Diego, CA     1
1  2015-07-07     Bellevue, WA     1
2  2015-07-07  Los Angeles, CA     1
3  2015-07-07     New York, NY     1
4  2015-07-07  Los Angeles, CA     1
5  2015-07-07          Unknown     1
# processing
# ==============================================
pd.pivot_table(df, index='Date', columns='Location', values='Text', aggfunc=np.sum)
Out[142]:
Location    Bellevue, WA  Los Angeles, CA  New York, NY  San Diego, CA  Unknown
Date
2015-07-07             1                2             1            NaN        1
2015-07-08           NaN              NaN           NaN              1      NaN
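A note on why the original call probably failed: df.pivot_table(...) is a method, so df is already the data, and passing df again as the first positional argument seems to collide with values=. Below is a minimal sketch of the method form, assuming a recent pandas and that Text really is numeric (the dtypes in the question show it as object, so it may need converting first). The names counts and raised are only for illustration, and fill_value=0 is my own addition so that missing Date/Location pairs show up as 0 rather than NaN, which is usually nicer for a heat map.

```python
import pandas as pd

# Rebuild the sample frame from the question; Date is parsed as datetime here.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2015-07-08", "2015-07-07", "2015-07-07",
                            "2015-07-07", "2015-07-07", "2015-07-07"]),
    "Location": ["San Diego, CA", "Bellevue, WA", "Los Angeles, CA",
                 "New York, NY", "Los Angeles, CA", "Unknown"],
    "Text": [1] * 6,
})

# Reproduce the failing call: df is passed both implicitly (as self)
# and explicitly as the first positional argument.
raised = False
try:
    df.pivot_table(df, values=["Text"], index=["Date"],
                   columns=["Location"], aggfunc="sum")
except TypeError:
    raised = True

# Method form with keyword arguments only; fill_value=0 replaces the
# NaN cells for Date/Location pairs that have no rows.
counts = df.pivot_table(values="Text", index="Date", columns="Location",
                        aggfunc="sum", fill_value=0)
print(counts)
```

The simplified call df.pivot_table('Date', 'Location', 'Text') has a related problem: as far as I can tell the positional order is values, index, columns, so it asks pandas to aggregate the Date column, which fits the "No numeric types to aggregate" message. Spelling out the keywords avoids that kind of mix-up.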