numpy 3. Statistics
Order statistics
Minimum
np.amin(a, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue, where=np._NoValue)
Maximum
np.amax(a, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue, where=np._NoValue)
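A minimal sketch of how axis and keepdims affect np.amin and np.amax. The 2x3 array is made up for illustration.

```python
import numpy as np

# Illustrative data (not from the notes above)
a = np.array([[3, 7, 1],
              [8, 2, 5]])

col_min = np.amin(a, axis=0)                     # minimum of each column
row_max = np.amax(a, axis=1)                     # maximum of each row
row_max_2d = np.amax(a, axis=1, keepdims=True)   # reduced axis kept with size 1

print(col_min)            # [3 2 1]
print(row_max)            # [7 8]
print(row_max_2d.shape)   # (2, 1)
```

With axis=None (the default) both functions reduce over the flattened array and return a single value.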
Range
Peak to peak: the maximum minus the minimum along the given axis. np.ptp(a, axis=None, out=None, keepdims=np._NoValue)
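A short sketch showing that np.ptp matches the maximum minus the minimum. The array values are made up for illustration.

```python
import numpy as np

# Illustrative data (not from the notes above)
a = np.array([[3, 7, 1],
              [8, 2, 5]])

ptp_cols = np.ptp(a, axis=0)   # range of each column
ptp_all = np.ptp(a)            # range of the flattened array

print(ptp_cols)   # [5 5 4]
print(ptp_all)    # 7

# Same result computed by hand
manual = np.amax(a, axis=0) - np.amin(a, axis=0)
print(np.array_equal(ptp_cols, manual))   # True
```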
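Percentile
a is the input array; q is the percentile to compute, in the range 0-100, and may be an array, in which case several percentiles are returned at once. axis selects the axis to reduce along (0 or 1 for a 2-D array; the default None flattens the array first). Returns the value below which q% of the data falls, interpolating linearly between neighbouring data points by default. NumPy 1.22 and later rename the interpolation parameter to method. np.percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False)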
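A small sketch of np.percentile with a scalar q, an array q, and an axis argument, using the default linear interpolation. The data is made up for illustration.

```python
import numpy as np

# Illustrative data (not from the notes above)
data = np.array([1, 2, 3, 4, 5])
grid = np.array([[1, 2],
                 [3, 4]])

p50 = np.percentile(data, 50)          # single percentile
p25_75 = np.percentile(data, [25, 75]) # several percentiles at once
p40 = np.percentile(data, 40)          # falls between 2 and 3, interpolated
p50_cols = np.percentile(grid, 50, axis=0)  # per-column percentile

print(p50)       # 3.0
print(p25_75)    # [2. 4.]
print(p40)       # about 2.6
print(p50_cols)  # [2. 3.]
```

For q=40 the position in the sorted data is 0.4 * (5 - 1) = 1.6, so the result lies 60% of the way from the second value to the third.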
Mean and variance
Median
np.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)
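A brief sketch of np.median on an even-length array and along each axis of a 2-D array. The values are made up for illustration.

```python
import numpy as np

# Illustrative data (not from the notes above)
even = np.array([1, 2, 3, 4])
a = np.array([[1, 5, 3],
              [4, 2, 6]])

med_even = np.median(even)        # mean of the two middle values
med_cols = np.median(a, axis=0)   # median of each column
med_rows = np.median(a, axis=1)   # median of each row

print(med_even)   # 2.5
print(med_cols)   # [2.5 3.5 4.5]
print(med_rows)   # [3. 4.]
```

The median equals the 50th percentile, so np.percentile(even, 50) gives the same 2.5.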
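Mean
np.mean(a, axis=None, dtype=None, out=None, keepdims=np._NoValue)
Weighted average
weights is the array of weights associated with the values in a; with weights=None every value has equal weight and the result equals np.mean. np.average(a, axis=None, weights=None, returned=False)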
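A minimal sketch comparing np.mean with np.average and showing what returned=True adds. The values and weights are made up for illustration.

```python
import numpy as np

# Illustrative data (not from the notes above)
values = np.array([10, 20, 30])
weights = np.array([1, 1, 2])

plain = np.mean(values)                          # (10 + 20 + 30) / 3
weighted = np.average(values, weights=weights)   # (10 + 20 + 60) / 4

# returned=True also gives back the sum of the weights
avg, weight_sum = np.average(values, weights=weights, returned=True)

print(plain)       # 20.0
print(weighted)    # 22.5
print(weight_sum)  # 4.0
```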
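Variance
The population variance divides by n; the unbiased estimate of the variance from a sample divides by n-1.
ddof: delta degrees of freedom. The divisor is n - ddof, so ddof=0 (the default) gives the population variance and ddof=1 gives the unbiased sample estimate. np.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=np._NoValue)
Standard deviation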
np.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=np._NoValue)
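A short sketch of how ddof changes the divisor in np.var and np.std. The eight data points are made up for illustration; their mean is 5 and their squared deviations sum to 32.

```python
import numpy as np

# Illustrative data (not from the notes above)
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

pop_var = np.var(data)             # divisor n = 8      -> 32 / 8
pop_std = np.std(data)             # square root of the population variance
samp_var = np.var(data, ddof=1)    # divisor n - 1 = 7  -> 32 / 7
samp_std = np.std(data, ddof=1)

print(pop_var)    # 4.0
print(pop_std)    # 2.0
print(samp_var)   # about 4.571
print(samp_std)   # about 2.138
```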
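Correlation
Covariance matrix
Intuitively, covariance is the expected value of the product of two variables' deviations from their means; it describes how strongly the two variables vary together: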
Cov(X, Y)
= E[(X-E(X)) * (Y-E(Y))]
= E[XY] - 2E[Y] * E[X] + E[X] * E[Y]
= E[XY] - E[X] * E[Y]
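Cov(X, X) is the variance of X.
In the function below, each row of m is a variable and each column an observation (rowvar=True). Given only a 1-D m it returns the variance; given a 2-D m, or m together with y, it returns the (symmetric) covariance matrix of all the variables. The default divides by n-1; bias=True divides by n. np.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None)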
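A small sketch that checks the identity Cov(X, Y) = E[XY] - E[X] * E[Y] against np.cov. The two samples are made up for illustration, and bias=True is used so that np.cov divides by n and matches the expectation formula directly.

```python
import numpy as np

# Illustrative data (not from the notes above)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Covariance from the definition, using sample means as expectations
cov_manual = np.mean(x * y) - np.mean(x) * np.mean(y)   # 15 - 12.5

# 2x2 covariance matrix: variances on the diagonal, Cov(x, y) off it
cov_matrix = np.cov(x, y, bias=True)

print(cov_manual)        # 2.5
print(cov_matrix[0, 1])  # Cov(x, y), also 2.5
print(cov_matrix[0, 0])  # Var(x) = 1.25
print(cov_matrix[1, 1])  # Var(y) = 5.0

# With the default bias=False the divisor is n - 1
print(np.isclose(np.cov(x), np.var(x, ddof=1)))  # True
```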
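Correlation coefficient
This is the covariance normalized by the standard deviations of the two variables, so it lies in [-1, 1] and measures the strength of their linear relationship. The correlation coefficients of n variables form an n x n matrix. Returns the (symmetric) correlation coefficient matrix. bias and ddof are deprecated and have no effect. np.corrcoef(x, y=None, rowvar=True, bias=np._NoValue, ddof=np._NoValue)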
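A minimal sketch showing that the correlation coefficient is the covariance divided by the product of the two standard deviations. The samples are made up for illustration.

```python
import numpy as np

# Illustrative data (not from the notes above)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

corr_matrix = np.corrcoef(x, y)   # 2x2 symmetric matrix, ones on the diagonal
r = corr_matrix[0, 1]

# Same value from the covariance matrix, normalized by the standard deviations
cov_xy = np.cov(x, y)[0, 1]
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

print(r)          # about 0.8
print(r_manual)   # about 0.8
```

The choice of ddof cancels in the ratio, as long as the covariance and the standard deviations use the same one.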
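Histogram
bins: a 1-D monotonic array of bin edges along the horizontal axis. right: whether each interval includes its right edge (instead of its left edge). Returns the index of the bin that each value in x falls into. np.digitize(x, bins, right=False)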
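A short sketch of how right changes which bin a value on an edge is assigned to. The bin edges and sample values are made up for illustration.

```python
import numpy as np

# Illustrative data (not from the notes above)
bins = np.array([0, 10, 20, 30])
x = np.array([-5, 0, 5, 10, 25, 30, 35])

# right=False: index i satisfies bins[i-1] <= x < bins[i]
left_closed = np.digitize(x, bins)
# right=True: index i satisfies bins[i-1] < x <= bins[i]
right_closed = np.digitize(x, bins, right=True)

print(left_closed)    # [0 1 1 2 3 4 4]
print(right_closed)   # [0 0 1 1 3 3 4]
```

Values below the first edge get index 0 and values above the last edge get len(bins), so the result can range from 0 to len(bins) inclusive.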
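Code:
import numpy as np

# Fixed seed so the random 3x3 matrix is reproducible
np.random.seed(926734542)
x = np.random.randint(0, 9, 9).reshape((3, 3))  # integers in [0, 9)
print(x)

# Order statistics
xmin = np.amin(x, axis=0)  # minimum of each column
print(xmin)
xmax = np.amax(x, axis=1)  # maximum of each row
print(xmax)
peak = np.ptp(x, axis=0)  # max - min of each column
print(peak)
perc = np.percentile(x, 50, axis=0)  # 50th percentile of each column
print(perc)

# Mean and variance
medi = np.median(x, axis=0)  # same as the 50th percentile above
print(medi)
mean = np.mean(x, axis=0)
print(mean)
aver = np.average(x, axis=0, weights=[0, 1, 2])  # one weight per row
print(aver)
var = np.var(x)  # population variance of all 9 values (ddof=0)
print(var)
std = np.std(x)  # population standard deviation of all 9 values
print(std)

# Correlation: the rows of x and y are treated as 6 variables
np.random.seed(5426)
y = np.random.randint(0, 9, 9).reshape((3, 3))
cov = np.cov(x, y)  # 6x6 covariance matrix
print(cov)
corr = np.corrcoef(x, y)  # 6x6 correlation coefficient matrix
print(corr)

# Histogram: bin index of every element of x, intervals closed on the right
z = np.arange(0, 9)
digi = np.digitize(x, z, right=True)
print(digi)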