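I had always assumed that scipy.linalg.norm() and numpy.linalg.norm() were equivalent (the scipy version used not to accept an axis argument, but now it does). However, the following simple example shows a significant difference in performance. What is the reason behind it?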
In [1]: from scipy.linalg import norm as normsp
In [2]: from numpy.linalg import norm as normnp
In [3]: import numpy as np
In [4]: a = np.random.random(size=(1000, 2000))
In [5]: %timeit normsp(a)
The slowest run took 5.69 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 2.85 ms per loop
In [6]: %timeit normnp(a)
The slowest run took 6.39 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 558 µs per loop
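The scipy version is 0.18.1 and the numpy version is 1.11.1.
Answer:
Looking at the source code shows that scipy has its own norm function, which wraps either numpy.linalg.norm or a BLAS function that is slower but handles floating-point overflow better (see the discussion in this PR).
However, in the example you give, SciPy does not appear to be using the BLAS function, so I don't think that explains the time difference you are seeing. What scipy does do is perform some additional checks before calling the numpy version of norm. In particular, the finiteness check a = np.asarray_chkfinite(a) is a likely cause of the performance difference: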
In [103]: %timeit normsp(a)
100 loops, best of 3: 5.1 ms per loop
In [104]: %timeit normnp(a)
1000 loops, best of 3: 744 µs per loop
In [105]: %timeit np.asarray_chkfinite(a)
100 loops, best of 3: 4.13 ms per loop
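So it looks like np.asarray_chkfinite roughly accounts for the difference in the time it takes to evaluate the norm: about 4.1 ms for the check plus about 0.7 ms for the numpy norm is close to the 5.1 ms measured for the scipy call.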
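To reproduce this outside IPython, here is a minimal sketch that uses only numpy. The helper norm_with_check is a hypothetical stand-in for the scipy code path described above (finiteness check, then numpy's norm); it is not scipy's actual implementation, and the timings will vary with the machine and library versions.

```python
import timeit
import numpy as np

def norm_with_check(a):
    # Hypothetical stand-in for the SciPy path described above:
    # validate that every entry is finite, then defer to NumPy.
    a = np.asarray_chkfinite(a)
    return np.linalg.norm(a)

a = np.random.random(size=(1000, 2000))

# On finite input both paths return the same (Frobenius) norm.
same = bool(np.isclose(norm_with_check(a), np.linalg.norm(a)))

# The finiteness check rejects inf/nan by raising ValueError,
# whereas np.linalg.norm alone would simply return inf.
b = a.copy()
b[0, 0] = np.inf
try:
    norm_with_check(b)
    rejected = False
except ValueError:
    rejected = True

# Rough per-call timings, averaged over n calls.
n = 20
t_plain = timeit.timeit(lambda: np.linalg.norm(a), number=n) / n
t_check = timeit.timeit(lambda: np.asarray_chkfinite(a), number=n) / n
t_both = timeit.timeit(lambda: norm_with_check(a), number=n) / n

print(f"np.linalg.norm only:   {t_plain * 1e3:.2f} ms")
print(f"np.asarray_chkfinite:  {t_check * 1e3:.2f} ms")
print(f"check + norm:          {t_both * 1e3:.2f} ms")
```

If the explanation above is right, the third timing should be roughly the sum of the first two. Newer SciPy releases also document a check_finite argument on scipy.linalg.norm; if your version has it, passing check_finite=False should skip this check, at the cost of undefined behavior on inf/nan input.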