Notes: Differentiating the Determinant and the Inverse of a Matrix


I. Results

Let $A=(a_{ij}(t))_{n\times n}$ be a matrix whose entries are differentiable functions of $t$, and suppose $A$ is invertible. Then

$$
\begin{aligned}
&\frac{d|A|}{dt} = |A|\operatorname{tr}\left(A^{-1}\frac{dA}{dt}\right)\\
&\frac{dA^{-1}}{dt} = -A^{-1}\frac{dA}{dt}A^{-1}
\end{aligned}
$$

The next two sections give a brief derivation, which may not be fully rigorous.
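As a quick sanity check of both formulas, here is a small worked example. The matrix below is chosen purely for illustration and is not part of the original note; it is valid for $t \neq 0$.

$$
A(t)=\begin{pmatrix} t & 1\\ 0 & t^2\end{pmatrix},\qquad
\frac{dA}{dt}=\begin{pmatrix} 1 & 0\\ 0 & 2t\end{pmatrix},\qquad
|A|=t^3,\qquad
A^{-1}=\begin{pmatrix} 1/t & -1/t^3\\ 0 & 1/t^2\end{pmatrix}
$$

$$
A^{-1}\frac{dA}{dt}=\begin{pmatrix} 1/t & -2/t^2\\ 0 & 2/t\end{pmatrix}
\;\Rightarrow\;
|A|\operatorname{tr}\left(A^{-1}\frac{dA}{dt}\right)=t^3\cdot\frac{3}{t}=3t^2=\frac{d|A|}{dt}
$$

$$
-A^{-1}\frac{dA}{dt}A^{-1}
=-\begin{pmatrix} 1/t^2 & -3/t^4\\ 0 & 2/t^3\end{pmatrix}
=\begin{pmatrix} -1/t^2 & 3/t^4\\ 0 & -2/t^3\end{pmatrix}
=\frac{dA^{-1}}{dt}
$$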

II. Derivative of the determinant

By the chain rule,

$$
\frac{d|A|}{dt} = \sum_{i}\sum_{j}\frac{\partial |A|}{\partial a_{ij}}\frac{da_{ij}}{dt}.
$$

Note that

$$
\frac{\partial |A|}{\partial a_{ij}} = \lim_{\varepsilon \to 0}\frac{|A+\varepsilon I_{ij}|-|A|}{\varepsilon} = \lim_{\varepsilon \to 0}\frac{\varepsilon A_{ij}}{\varepsilon} = A_{ij},
$$

where $I_{ij}$ is the matrix whose $(i,j)$ entry is 1 and whose other entries are all 0, and $A_{ij}$ is the cofactor of $A$ at position $(i,j)$. The second equality holds because expanding $|A+\varepsilon I_{ij}|$ along row $i$ gives $|A|+\varepsilon A_{ij}$.
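The cofactor identity can be checked numerically. The following is a minimal sketch in plain Python, not part of the original note: the helper names (`minor`, `det`, `cofactor`, `difference_quotient`) and the sample matrix are assumptions made for illustration. Because the determinant is affine in each single entry, the difference quotient should match the cofactor for any nonzero `eps`, not only in the limit.

```python
def minor(m, i, j):
    """Return a copy of m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def cofactor(m, i, j):
    """Cofactor A_ij: signed determinant of the (i, j) minor."""
    return (-1) ** (i + j) * det(minor(m, i, j))


def difference_quotient(m, i, j, eps):
    """(|A + eps * I_ij| - |A|) / eps for a finite eps."""
    bumped = [row[:] for row in m]
    bumped[i][j] += eps
    return (det(bumped) - det(m)) / eps


# Illustrative sample matrix (an assumption, not from the note).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

# Largest mismatch between the difference quotient and the cofactor.
max_gap = max(
    abs(difference_quotient(A, i, j, 0.5) - cofactor(A, i, j))
    for i in range(3)
    for j in range(3)
)
print(max_gap)
```

The recursive expansion costs $O(n!)$ and is only meant for tiny matrices like this one.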
Substituting these cofactors into the chain-rule expression gives

$$
\frac{d|A|}{dt} = \sum_{i}\sum_{j}A_{ij}\frac{da_{ij}}{dt}.
$$

Moreover,

$$
\begin{aligned}
&\frac{da_{ij}}{dt} = \left(\frac{dA}{dt}\right)_{ij}\\
&A^*A = |A|I_n \Rightarrow A^* = |A|A^{-1} \Rightarrow A_{ij} = |A|\left(A^{-1}\right)_{ji}
\end{aligned}
$$

where $A^*$ is the adjugate of $A$, so that $(A^*)_{ji} = A_{ij}$. Therefore,

$$
\begin{aligned}
\frac{d|A|}{dt} &= \sum_{i}\sum_{j}A_{ij}\frac{da_{ij}}{dt}\\
&= \sum_{j}\sum_{i}|A|\left(A^{-1}\right)_{ji}\left(\frac{dA}{dt}\right)_{ij}\\
&= |A|\sum_{j}\left(A^{-1}\frac{dA}{dt}\right)_{jj}\\
&= |A|\operatorname{tr}\left(A^{-1}\frac{dA}{dt}\right)
\end{aligned}
$$
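The final formula can be compared against a finite-difference derivative. This is a hedged sketch in plain Python, not the author's code: the matrix function `A(t)`, its hand-computed derivative `dA(t)`, the evaluation point `t0`, and all helper names are hypothetical choices for illustration. The inverse is built from cofactors, mirroring $A_{ij} = |A|(A^{-1})_{ji}$ from the derivation.

```python
import math


def minor(m, i, j):
    """Return a copy of m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def inverse(m):
    """Inverse via the adjugate: (A^{-1})_{ji} = A_{ij} / |A|."""
    n, d = len(m), det(m)
    return [[(-1) ** (i + j) * det(minor(m, i, j)) / d for i in range(n)]
            for j in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def A(t):
    """Illustrative matrix-valued function (an assumption, not from the note)."""
    return [[2.0 + t, t * t, 1.0],
            [math.sin(t), 3.0, t],
            [0.0, t, 4.0 + math.cos(t)]]


def dA(t):
    """Entrywise derivative of A(t), computed by hand."""
    return [[1.0, 2.0 * t, 0.0],
            [math.cos(t), 0.0, 1.0],
            [0.0, 1.0, -math.sin(t)]]


def jacobi_rhs(t):
    """|A| * tr(A^{-1} dA/dt), the right-hand side of the formula."""
    prod = matmul(inverse(A(t)), dA(t))
    return det(A(t)) * sum(prod[k][k] for k in range(len(prod)))


def det_derivative_fd(t, h=1e-6):
    """Central finite difference of |A(t)|."""
    return (det(A(t + h)) - det(A(t - h))) / (2.0 * h)


t0 = 0.3
print(jacobi_rhs(t0), det_derivative_fd(t0))
```

The two printed values should agree up to finite-difference error; the step size `h` is a typical but arbitrary choice.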

III. Derivative of the inverse

First, differentiate the identity $AA^{-1} = I_n$ using the product rule:

$$
\mathbf{0}_{n\times n} = \frac{dI_n}{dt} = \frac{d\left(AA^{-1}\right)}{dt} = \frac{dA}{dt}A^{-1} + A\frac{dA^{-1}}{dt}.
$$

Moving the first term to the other side and left-multiplying by $A^{-1}$ gives

$$
\frac{dA^{-1}}{dt} = -A^{-1}\frac{dA}{dt}A^{-1}.
$$
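The same kind of numerical check works for the inverse. Below is a minimal plain-Python sketch under stated assumptions: the $2\times 2$ matrix function `A(t)`, its derivative `dA(t)`, and the helpers `inv2` and `matmul2` are illustrative and do not come from the original note. It compares $-A^{-1}\frac{dA}{dt}A^{-1}$ with a central finite difference of $A^{-1}(t)$.

```python
import math


def inv2(m):
    """Closed-form inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def A(t):
    """Illustrative matrix-valued function (an assumption, not from the note)."""
    return [[2.0 + t, t * t], [math.sin(t), 3.0]]


def dA(t):
    """Entrywise derivative of A(t), computed by hand."""
    return [[1.0, 2.0 * t], [math.cos(t), 0.0]]


def inverse_derivative_formula(t):
    """-A^{-1} (dA/dt) A^{-1}."""
    a_inv = inv2(A(t))
    prod = matmul2(matmul2(a_inv, dA(t)), a_inv)
    return [[-x for x in row] for row in prod]


def inverse_derivative_fd(t, h=1e-6):
    """Central finite difference of A^{-1}(t), entry by entry."""
    plus, minus = inv2(A(t + h)), inv2(A(t - h))
    return [[(plus[i][j] - minus[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]


t0 = 0.3
formula = inverse_derivative_formula(t0)
numeric = inverse_derivative_fd(t0)

# Largest entrywise mismatch between the formula and the finite difference.
max_gap = max(abs(formula[i][j] - numeric[i][j])
              for i in range(2) for j in range(2))
print(max_gap)
```

Note the order of the factors: since matrices do not commute in general, $A^{-1}$ must appear on both sides of $\frac{dA}{dt}$, which is why the sketch multiplies twice rather than squaring the inverse.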

References:

  1. 行列式的导数 (Derivative of the Determinant), by 苏剑林 (Su Jianlin).
  2. 逆矩阵求导 (Differentiating the Inverse Matrix).