There are two major differences between the transform and apply groupby methods.
- apply implicitly passes all the columns for each group as a DataFrame to the custom function, while transform passes each column for each group as a Series to the custom function.
- The custom function passed to apply can return a scalar, a Series, or a DataFrame (or a NumPy array or even a list). The custom function passed to transform must return a sequence (a one-dimensional Series, array, or list) the same length as the group.
So, transform works on just one Series at a time and apply works on the entire DataFrame at once.
From: https://*.com/questions/27517425/apply-vs-transform-on-a-group-object#
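The transform function:
1. Operates on only one Series at a time. A function that subtracts column 'b' from column 'a' raises an exception, because it never sees both columns together.
2. Must return a one-dimensional sequence with the same number of rows as the group.
3. May also return a single scalar, as in .transform(sum); the scalar is then repeated for every row of the group.
The apply function:
1. Unlike transform, which handles one Series at a time, apply operates on the whole DataFrame of each group.
2. apply implicitly passes all the columns of each group to the custom function.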
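The points above can be sketched as follows. The sample DataFrame and the helper name subtract_two are assumptions made for illustration, and the exact exception raised by transform may differ between pandas versions, so the sketch catches a generic Exception.

```python
import pandas as pd

# Sample data assumed for illustration (not part of the original text).
df = pd.DataFrame({
    "State": ["Texas", "Texas", "Florida", "Florida"],
    "a": [4, 5, 1, 3],
    "b": [6, 10, 3, 11],
})

def subtract_two(x):
    # Hypothetical helper: only works when x holds both columns.
    return x["a"] - x["b"]

# Select the value columns explicitly so the grouping column is left out.
grouped = df.groupby("State")[["a", "b"]]

# apply receives each group as a DataFrame, so both columns are visible.
diff_sum = grouped.apply(lambda g: subtract_two(g).sum())
print(diff_sum)

# transform receives one column at a time, so x["a"] cannot be resolved.
try:
    grouped.transform(subtract_two)
    transform_failed = False
except Exception as exc:  # exception type may vary by pandas version
    transform_failed = True
    print(type(exc).__name__, exc)
```

Here apply reduces each group to one number, while the same column arithmetic fails under transform.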
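Example:
A function that returns a single scalar can be used with transform.
The outputs of transform and apply take different forms: transform returns as many rows as the original data, while apply aggregates each group into a single value.
In this case, apply conveys the information more clearly.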
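A minimal sketch of that comparison, assuming a small sample DataFrame with two rows per state (the data is illustrative, not from the original post). The string alias "sum" stands in for the built-in sum used above.

```python
import pandas as pd

# Sample data assumed for illustration (not part of the original text).
df = pd.DataFrame({
    "State": ["Texas", "Texas", "Florida", "Florida"],
    "a": [4, 5, 1, 3],
    "b": [6, 10, 3, 11],
})

grouped_a = df.groupby("State")["a"]

# transform repeats the per-group sum on every original row.
per_row = grouped_a.transform("sum")

# apply with a reducing function yields one value per group.
per_group = grouped_a.apply(lambda s: s.sum())

print(per_row)    # same length and index as df
print(per_group)  # one entry per state
```

The transform result can be assigned straight back as a new column of df, whereas the apply result is a compact per-group summary.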
The other difference is that transform must return a single-dimensional sequence the same size as the group. In this particular instance, each group has two rows, so transform must return a sequence of two rows. If it does not, an error is raised.
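The length requirement can be checked with a sketch like the one below. The sample data and the function name wrong_length are assumptions, and since the error type and message depend on the pandas version, a generic Exception is caught.

```python
import pandas as pd

# Sample data assumed for illustration: two rows per group.
df = pd.DataFrame({
    "State": ["Texas", "Texas", "Florida", "Florida"],
    "a": [4, 5, 1, 3],
    "b": [6, 10, 3, 11],
})

def wrong_length(s):
    # Hypothetical function: returns three values for a two-row group.
    return [1, 2, 3]

try:
    df.groupby("State")["a"].transform(wrong_length)
    raised = False
except Exception as exc:  # type and message vary by pandas version
    raised = True
    print(type(exc).__name__, exc)
```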
Example 2:
The function passed to transform must return a number, a row, or a result with the same shape as the argument. If it is a number, that number is assigned to every element in the group; if it is a row, it is broadcast to all the rows in the group.
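Two of these return shapes, a single number and a same-shape result, can be illustrated with the sketch below on a single column. The sample data is an assumption made for the example.

```python
import pandas as pd

# Sample data assumed for illustration (not part of the original text).
df = pd.DataFrame({
    "State": ["Texas", "Texas", "Florida", "Florida"],
    "a": [4, 5, 1, 3],
    "b": [6, 10, 3, 11],
})

grouped_a = df.groupby("State")["a"]

# A number per group: it is broadcast to every row of that group.
group_max = grouped_a.transform(lambda s: s.max())

# Same shape as the argument: values are kept row by row.
demeaned = grouped_a.transform(lambda s: s - s.mean())

print(group_max)
print(demeaned)
```

In both cases the result keeps the original row order and index of df.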
Reference: https://*.com/questions/27517425/apply-vs-transform-on-a-group-object#