我有两个点x和y的向量,分别为(n,p)和(m,p).举个例子:
x = np.array([[ 0. , -0.16341, 0.98656],
[-0.05937, -0.25205, 0.96589],
[ 0.05937, -0.25205, 0.96589],
[-0.11608, -0.33488, 0.93508],
[ 0. , -0.33416, 0.94252]])
y = np.array([[ 0. , -0.36836, 0.92968],
[-0.12103, -0.54558, 0.82928],
[ 0.12103, -0.54558, 0.82928]])
我想计算一个(n,m)大小的矩阵,其中包含两点之间的角度,la this问题.也就是说,是以下内容的向量化版本:
theta = np.array(
[ np.arccos(np.dot(i, j) / (la.norm(i) * la.norm(j)))
for i in x for j in y ]
).reshape((n, m))
注意:n和m可以分别约为10000.
解决方法:
有多种方法可以做到这一点:
import numpy.linalg as la
from scipy.spatial import distance as dist
# Manually
def method0(x, y):
dotprod_mat = np.dot(x, y.T)
costheta = dotprod_mat / la.norm(x, axis=1)[:, np.newaxis]
costheta /= la.norm(y, axis=1)
return np.arccos(costheta)
# Using einsum
def method1(x, y):
dotprod_mat = np.einsum('ij,kj->ik', x, y)
costheta = dotprod_mat / la.norm(x, axis=1)[:, np.newaxis]
costheta /= la.norm(y, axis=1)
return np.arccos(costheta)
# Using scipy.spatial.cdist (one-liner)
def method2(x, y):
costheta = 1 - dist.cdist(x, y, 'cosine')
return np.arccos(costheta)
# Realize that your arrays `x` and `y` are already normalized, meaning you can
# optimize method1 even more
def method3(x, y):
costheta = np.einsum('ij,kj->ik', x, y) # Directly gives costheta, since
# ||x|| = ||y|| = 1
return np.arccos(costheta)
(n,m)=(1212,252)的计时结果:
>>> %timeit theta = method0(x, y)
100 loops, best of 3: 11.1 ms per loop
>>> %timeit theta = method1(x, y)
100 loops, best of 3: 10.8 ms per loop
>>> %timeit theta = method2(x, y)
100 loops, best of 3: 12.3 ms per loop
>>> %timeit theta = method3(x, y)
100 loops, best of 3: 9.42 ms per loop
时序差异随着元素数量的增加而减小.对于(n,m)=(6252,1212):
>>> %timeit -n10 theta = method0(x, y)
10 loops, best of 3: 365 ms per loop
>>> %timeit -n10 theta = method1(x, y)
10 loops, best of 3: 358 ms per loop
>>> %timeit -n10 theta = method2(x, y)
10 loops, best of 3: 384 ms per loop
>>> %timeit -n10 theta = method3(x, y)
10 loops, best of 3: 314 ms per loop
但是,如果您省略了np.arccos步骤,即假设您可以只用costheta进行管理,而无需theta本身,则:
>>> %timeit costheta = np.einsum('ij,kj->ik', x, y)
10 loops, best of 3: 61.3 ms per loop
>>> %timeit costheta = 1 - dist.cdist(x, y, 'cosine')
10 loops, best of 3: 124 ms per loop
>>> %timeit costheta = dist.cdist(x, y, 'cosine')
10 loops, best of 3: 112 ms per loop
对于(6252,1212)的情况.因此,实际上np.arccos占用了80%的时间.在这种情况下,我发现np.einsum比dist.cdist快得多.因此,您绝对希望使用einsum.
简介:theta的结果在很大程度上相似,但是np.einsum对我而言最快,尤其是当您没有多余地计算标准时.尝试避免计算theta并仅使用costheta.
注意:我没有提到的重要一点是,浮点精度的有限会导致np.arccos给出nan值.方法[0:3]自然可以处理未正确归一化的x和y值.但是method3给出了一些难点.我用预归一化解决了这个问题,这会自然破坏使用method3的任何收益,除非您需要为一小套预归一化的矩阵多次执行此计算(无论出于何种原因).