Advanced usage of Python's pandas module

1. agg: aggregation with built-in functions

>>> import pandas as pd
>>> import numpy as np
>>> pp = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],index=pd.date_range('1/1/2000', periods=10))
>>> pp
                   A         B         C
2000-01-01  0.754524 -0.855136  0.135573
2000-01-02  0.224428 -2.025685  0.590259
2000-01-03 -0.894270  1.956547 -0.515041
2000-01-04  0.794662  0.005409 -1.846422
2000-01-05  0.808849  1.283276 -0.681725
2000-01-06  0.538258 -0.249534  0.217653
2000-01-07  0.582666 -0.656912 -0.780406
2000-01-08 -0.981985  1.125303  0.230330
2000-01-09  1.303636  0.806432  0.556127
2000-01-10 -1.207910  2.382836  0.959141
>>> pp.iloc[3:7] = np.nan   # assign NaN directly to the rows at positions 3 through 6
>>> pp
                   A         B         C
2000-01-01  0.754524 -0.855136  0.135573
2000-01-02  0.224428 -2.025685  0.590259
2000-01-03 -0.894270  1.956547 -0.515041
2000-01-04       NaN       NaN       NaN
2000-01-05       NaN       NaN       NaN
2000-01-06       NaN       NaN       NaN
2000-01-07       NaN       NaN       NaN
2000-01-08 -0.981985  1.125303  0.230330
2000-01-09  1.303636  0.806432  0.556127
2000-01-10 -1.207910  2.382836  0.959141
>>> pp.agg(np.sum)  # usage 1: pass a function object
A   -0.801575
B    3.390298
C    1.956388
dtype: float64
>>> pp.agg('sum')  # usage 2: pass the function name as a string
A   -0.801575
B    3.390298
C    1.956388
dtype: float64
>>> pp.A.agg('sum')  # apply to a single column
-0.8015753184519548
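
The sums above are computed over the non-NaN rows only, because `sum` skips missing values by default. A minimal sketch of the same calls on a small deterministic frame follows; the data and the names `df`, `col_sums`, `multi` and `single` are made up for illustration and are not the random frame used above. It also shows that `agg` accepts a list of function names, which gives one result row per function.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a few missing values (not the random frame above)
df = pd.DataFrame(
    {"A": [1.0, 2.0, np.nan, 4.0], "B": [10.0, np.nan, np.nan, 40.0]},
    index=pd.date_range("2000-01-01", periods=4),
)

# NaN entries are skipped, so each column sums only its valid values
col_sums = df.agg("sum")

# A list of function names yields a DataFrame with one row per function
multi = df.agg(["sum", "mean", "count"])

# On a single column (a Series) the result is a scalar
single = df["A"].agg("sum")

print(col_sums)
print(multi)
print(single)
```

Here `count` reports how many non-NaN values each column holds, which is a quick way to check how many rows actually contributed to `sum` and `mean`.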

>>> pp.agg({'A': ['mean', 'sum'], 'B': 'sum'})  # apply one or several functions to each column separately
             A         B
mean -0.133596       NaN
sum  -0.801575  3.390298
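
The NaN in the result marks a function that was not requested for that column (here, `mean` for `B`); it is not a computed value. A small sketch with made-up data illustrates this; the frame `df` and the name `result` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data without missing values, so any NaN in the result
# can only come from a function that was not requested for a column
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})

# 'A' gets two functions, 'B' gets one
result = df.agg({"A": ["mean", "sum"], "B": "sum"})

print(result)
print(result.loc["sum", "B"])           # requested, so a real value
print(pd.isna(result.loc["mean", "B"]))  # not requested, so NaN
```

Looking cells up by label with `.loc`, as above, avoids depending on the row order of the result.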

 
