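As I understand it, differentiating a function with respect to a matrix, or a function matrix with respect to a matrix, is mainly a notational shorthand: the partial derivatives of several multivariate functions with respect to each variable are collected into a single matrix, which makes the expressions more compact.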
Derivative of a function with respect to a matrix
Let $\mathbf{X}=(\xi_{ij})_{m\times n}$ be a matrix and let $f(\mathbf{X})=f(\xi_{11},\xi_{12},\xi_{13},\dots,\xi_{m1},\dots,\xi_{mn})$ be a function of its $mn$ entries. The derivative of $f(\mathbf{X})$ with respect to the matrix $\mathbf{X}$ is defined as

$$\dfrac{df}{d\mathbf{X}}=\left(\dfrac{\partial f}{\partial \xi_{ij}}\right)_{m\times n}=\begin{bmatrix} \dfrac{\partial f}{\partial \xi_{11}} & \dfrac{\partial f}{\partial \xi_{12}} & \ldots & \dfrac{\partial f}{\partial \xi_{1n}} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f}{\partial \xi_{m1}} & \dfrac{\partial f}{\partial \xi_{m2}} & \ldots & \dfrac{\partial f}{\partial \xi_{mn}} \end{bmatrix}$$
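To make the definition concrete, the entrywise partial derivatives can be approximated numerically. The following is a minimal sketch in plain Python, not part of the original derivation: the helper name `numerical_matrix_derivative`, the test function `sum_of_squares`, and the step size `h` are all illustrative assumptions. For $f(\mathbf{X})=\sum_{i,j}\xi_{ij}^{2}$ the definition gives $\dfrac{df}{d\mathbf{X}}=2\mathbf{X}$, which the sketch should reproduce up to rounding error.

```python
def numerical_matrix_derivative(f, X, h=1e-6):
    """Approximate df/dX entry by entry with central differences.

    X is a list of m rows, each a list of n floats; the result has
    the same m x n shape, with entry (i, j) approximating df/dxi_ij.
    """
    m, n = len(X), len(X[0])
    grad = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Perturb only the (i, j) entry, leaving the rest fixed.
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


def sum_of_squares(X):
    """Hypothetical test function: f(X) = sum of xi_ij squared."""
    return sum(x * x for row in X for x in row)


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
dfdX = numerical_matrix_derivative(sum_of_squares, X)
```

Here `dfdX` has the same $2\times 3$ shape as `X`, which is the point of the definition: the derivative of a scalar function with respect to a matrix is laid out exactly like the matrix itself.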
For example, let $\mathbf{X}=(\xi_1,\xi_2,\dots,\xi_n)^{T}$ and let $f(\mathbf{X})=f(\xi_1,\xi_2,\dots,\xi_n)$ be a function of $n$ variables. Then

$$\dfrac{df}{d\mathbf{X}}=\left(\dfrac{\partial f}{\partial \xi_1},\dfrac{\partial f}{\partial \xi_2},\dots,\dfrac{\partial f}{\partial \xi_n}\right)^{T}$$
Likewise, differentiating with respect to the row vector $\mathbf{X}^{T}$ gives

$$\dfrac{df}{d\mathbf{X}^{T}}=\left(\dfrac{\partial f}{\partial \xi_1},\dfrac{\partial f}{\partial \xi_2},\dots,\dfrac{\partial f}{\partial \xi_n}\right)$$
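As a small worked illustration of the two layouts (the constant vector $\mathbf{a}$ below is an assumption introduced only for this example), take the linear function $f(\mathbf{X})=\mathbf{a}^{T}\mathbf{X}=a_1\xi_1+a_2\xi_2+\dots+a_n\xi_n$. Since $\dfrac{\partial f}{\partial \xi_k}=a_k$ for every $k$, the definitions give

$$\dfrac{df}{d\mathbf{X}}=(a_1,a_2,\dots,a_n)^{T}=\mathbf{a},\qquad \dfrac{df}{d\mathbf{X}^{T}}=(a_1,a_2,\dots,a_n)=\mathbf{a}^{T}$$

The entries are the same in both cases; only the shape changes, following the shape of the variable being differentiated against.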
Derivative of a function matrix with respect to a matrix
Let $\mathbf{X}=(\xi_{ij})_{m\times n}$ be a matrix and let $f_{ij}(\mathbf{X})=f_{ij}(\xi_{11},\xi_{12},\xi_{13},\dots,\xi_{m1},\dots,\xi_{mn})$ $(i=1,2,\dots,r;\ j=1,2,\dots,s)$ be functions of its $mn$ entries. Define the function matrix

$$\mathbf{F}(\mathbf{X})=\begin{bmatrix} f_{11}(\mathbf{X}) & \ldots & f_{1s}(\mathbf{X}) \\ \vdots & & \vdots \\ f_{r1}(\mathbf{X}) & \ldots & f_{rs}(\mathbf{X}) \end{bmatrix}$$

Its derivative with respect to the matrix $\mathbf{X}$ is

$$\dfrac{d\mathbf{F}}{d\mathbf{X}}=\begin{bmatrix} \dfrac{\partial \mathbf{F}}{\partial \xi_{11}} & \dfrac{\partial \mathbf{F}}{\partial \xi_{12}} & \ldots & \dfrac{\partial \mathbf{F}}{\partial \xi_{1n}} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial \mathbf{F}}{\partial \xi_{m1}} & \dfrac{\partial \mathbf{F}}{\partial \xi_{m2}} & \ldots & \dfrac{\partial \mathbf{F}}{\partial \xi_{mn}} \end{bmatrix}$$
where each block is itself an $r\times s$ matrix:

$$\dfrac{\partial \mathbf{F}}{\partial \xi_{ij}}=\begin{bmatrix} \dfrac{\partial f_{11}}{\partial \xi_{ij}} & \dfrac{\partial f_{12}}{\partial \xi_{ij}} & \ldots & \dfrac{\partial f_{1s}}{\partial \xi_{ij}} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_{r1}}{\partial \xi_{ij}} & \dfrac{\partial f_{r2}}{\partial \xi_{ij}} & \ldots & \dfrac{\partial f_{rs}}{\partial \xi_{ij}} \end{bmatrix}$$
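The block structure is easier to see when it is assembled explicitly: with $\mathbf{X}$ of size $m\times n$ and $\mathbf{F}$ of size $r\times s$, stacking the $m\times n$ grid of $r\times s$ blocks yields an $mr\times ns$ matrix. The sketch below builds that matrix numerically in plain Python. It is only an illustration under stated assumptions: the helper names `partial_F` and `matrix_derivative`, the sample function matrix `F`, and the central-difference step `h` are hypothetical and not taken from the original text.

```python
def partial_F(F, X, i, j, h=1e-6):
    """Approximate the r x s block dF/dxi_ij by central differences."""
    plus = [row[:] for row in X]
    minus = [row[:] for row in X]
    plus[i][j] += h
    minus[i][j] -= h
    Fp, Fm = F(plus), F(minus)
    return [[(a - b) / (2 * h) for a, b in zip(rp, rm)]
            for rp, rm in zip(Fp, Fm)]


def matrix_derivative(F, X, h=1e-6):
    """Assemble dF/dX as an (m*r) x (n*s) matrix of stacked blocks."""
    m, n = len(X), len(X[0])
    # blocks[i][j] is the r x s matrix dF/dxi_ij.
    blocks = [[partial_F(F, X, i, j, h) for j in range(n)] for i in range(m)]
    r = len(blocks[0][0])
    out = []
    for i in range(m):          # block row
        for k in range(r):      # row inside each block
            row = []
            for j in range(n):  # block column
                row.extend(blocks[i][j][k])
            out.append(row)
    return out


def F(X):
    """Hypothetical 2 x 2 function matrix of X = (xi_1, xi_2)^T."""
    x1, x2 = X[0][0], X[1][0]
    return [[x1 * x2, x1 + x2],
            [x1 * x1, x2 * x2]]


X = [[2.0], [3.0]]  # a 2 x 1 column vector, so m = 2 and n = 1
dFdX = matrix_derivative(F, X)
```

For this choice, `dFdX` is approximately the $4\times 2$ matrix with rows $(3,1)$, $(4,0)$, $(2,1)$, $(0,6)$: the top two rows are $\partial\mathbf{F}/\partial\xi_1$ and the bottom two rows are $\partial\mathbf{F}/\partial\xi_2$, both evaluated at $\xi_1=2,\ \xi_2=3$.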
For example, take again $\mathbf{X}=(\xi_1,\xi_2,\dots,\xi_n)^{T}$ and the $n$-variable function $f(\mathbf{X})=f(\xi_1,\xi_2,\dots,\xi_n)$, so that

$$\dfrac{df}{d\mathbf{X}}=\left(\dfrac{\partial f}{\partial \xi_1},\dfrac{\partial f}{\partial \xi_2},\dots,\dfrac{\partial f}{\partial \xi_n}\right)^{T}$$
Therefore,

$$\dfrac{d}{d\mathbf{X}^{T}}\left(\dfrac{df}{d\mathbf{X}}\right)=\begin{bmatrix} \dfrac{\partial^{2} f}{\partial \xi_{1}^{2}} & \dfrac{\partial^{2} f}{\partial \xi_{1}\partial \xi_{2}} & \cdots & \dfrac{\partial^{2} f}{\partial \xi_{1}\partial \xi_{n}} \\ \dfrac{\partial^{2} f}{\partial \xi_{2}\partial \xi_{1}} & \dfrac{\partial^{2} f}{\partial \xi_{2}^{2}} & \cdots & \dfrac{\partial^{2} f}{\partial \xi_{2}\partial \xi_{n}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^{2} f}{\partial \xi_{n}\partial \xi_{1}} & \dfrac{\partial^{2} f}{\partial \xi_{n}\partial \xi_{2}} & \cdots & \dfrac{\partial^{2} f}{\partial \xi_{n}^{2}} \end{bmatrix}$$
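A short worked case may help check this second-derivative matrix (commonly called the Hessian of $f$). The specific function below is an assumption chosen only for illustration. Take $n=2$ and $f(\mathbf{X})=\xi_1^{2}\xi_2+\xi_2^{3}$. Differentiating with respect to the column vector $\mathbf{X}$ first gives

$$\dfrac{df}{d\mathbf{X}}=\begin{bmatrix} 2\xi_1\xi_2 \\ \xi_1^{2}+3\xi_2^{2} \end{bmatrix}$$

and differentiating this $2\times 1$ function matrix with respect to the row vector $\mathbf{X}^{T}$ places $\partial/\partial\xi_1$ of it in the first column and $\partial/\partial\xi_2$ of it in the second:

$$\dfrac{d}{d\mathbf{X}^{T}}\left(\dfrac{df}{d\mathbf{X}}\right)=\begin{bmatrix} 2\xi_2 & 2\xi_1 \\ 2\xi_1 & 6\xi_2 \end{bmatrix}$$

The result is symmetric, as expected when the second partial derivatives of $f$ are continuous.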