My DataFrame looks like this:
   SK_ID_CURR CREDIT_ACTIVE
0      215354        Closed
1      215354        Active
2      215354        Active
3      215354        Active
4      215354        Active
5      215354        Active
6      215354        Active
7      162297        Closed
8      162297        Closed
9      162297        Active
For each id, I want to count the active and closed credits, then create two new columns, Active_credits and Closed_credits, holding the corresponding counts for that id.
Solution:
You can use pandas.crosstab, which avoids the intermediate step you suggested:
import pandas as pd
# One row per SK_ID_CURR, one column per distinct CREDIT_ACTIVE value
res = pd.crosstab(df['SK_ID_CURR'], df['CREDIT_ACTIVE'])
print(res)
CREDIT_ACTIVE  Active  Closed
SK_ID_CURR
162297              1       2
215354              6       1
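The crosstab result uses the status values as column names. If you need the exact Active_credits and Closed_credits columns from the question, one possible follow-up is to rename the columns and, if you want the counts on every original row, merge them back. The sketch below rebuilds the sample data from the question; the names counts and out are only illustrative.

```python
import pandas as pd

# Sample data mirroring the frame shown in the question.
df = pd.DataFrame({
    "SK_ID_CURR": [215354] * 7 + [162297] * 3,
    "CREDIT_ACTIVE": ["Closed"] + ["Active"] * 6 + ["Closed", "Closed", "Active"],
})

# Count each status per id.
res = pd.crosstab(df["SK_ID_CURR"], df["CREDIT_ACTIVE"])

# Rename the status columns to the names requested in the question
# and turn SK_ID_CURR back into a regular column.
counts = res.rename(
    columns={"Active": "Active_credits", "Closed": "Closed_credits"}
).reset_index()
counts.columns.name = None  # drop the leftover "CREDIT_ACTIVE" columns label

# Optional: attach the per-id counts to every original row.
out = df.merge(counts, on="SK_ID_CURR", how="left")

print(counts)
print(out)
```

Here counts has one row per id, while out keeps all ten original rows with the two count columns repeated for each id. Note that the rename assumes both "Active" and "Closed" occur in the data; any other status value would simply keep its own column name.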