[ML]FFCSC数学原理与算法简述(与代码对应)

本文档对FFCSC_raw代码的核心算法进行简单叙述, 推导过程省略, 符号均与代码中的变量名相匹配.

目标函数

\[\mathop{\text{arg min}}\limits_{d,z}\frac{\lambda_{r}}{2}||b-Dz||_2^2+\lambda_{p}||z||_1+ind_C(d) \]

使用 ADMM 方法进行优化, 对滤波器 d 和特征图 z 交替进行.

滤波器 d 的优化

关于滤波器的优化问题可以表示为

\[\mathop{\text{min}}\limits_d\frac{\lambda_{r}}{2}||b-MZd||_2^2+ind_C(d) \]

\[\left[ \begin{array}{} v^{(1)}\\ v^{(2)} \end{array} \right] = \left[ \begin{array}{} Zd\\d \end{array} \right] \]

\[\left[\begin{array}{}u_k^{(1)}\\u_k^{(2)}\end{array}\right]=\left[\begin{array}{}v_{k+1}^{(1)}\\v_{k+1}^{(2)}\end{array}\right] \]

可转化为增广拉格朗日问题

\[\begin{align} L_\gamma(d,v^{(\cdot)},d^{(\cdot)})=&\frac{\lambda_{r}}{2}||b-Mv^{(1)}||_2^2+\gamma_1[d^{(1)}]^T(v^{(1)}-Zd)+\frac{\gamma_1}{2}||v^{(1)}-Zd||_2^2+...\\ &ind_C(v^{(2)})+\gamma_2[d^{(2)}]^T(v^{(2)}-d)+\frac{\gamma_2}{2}||v^{(2)}-d||_2^2 \end{align} \]

上述增广拉格朗日问题可划分为若干个子问题:

\[d=\left[Z^TZ+\frac{\gamma_2}{\gamma_1}I\right]^{-1}\left[Z^T(v^{(1)}+d^{(1)})+\frac{\gamma_2}{\gamma_1}(v^{(2)}+d^{(2)})\right] \]

\[u^{(1)}=\left[M^TM+\frac{\gamma_1}{\lambda_r}I\right]^{-1}\left[M^Tb+\frac{\gamma_1}{\lambda_r}(v^{(1)}-d^{(1)})\right] \]

\[u^{(2)}=\begin{cases}\frac{v^{(2)}-d^{(2)}}{||v^{(2)}-d^{(2)}||_2}\ &:||v^{(2)}-d^{(2)}||_2^2\ge1\\v^{(2)}-d^{(2)}\ &:\text{else}\end{cases} \]

\[d^{(1)}=d^{(1)}+u^{(1)}-v^{(1)}\\ d^{(2)}=d^{(2)}+u^{(2)}-v^{(2)} \]

其中d的更新可以通过矩阵逆引理简化为

\[d=\underbrace{\frac{\gamma_1}{\gamma_2}\left[I-Z^T(\frac{\gamma_2}{\gamma_1}I+ZZ^T)^{-1}Z\right]}_{\text{Matrix inversion lemma}}\left[Z^T\underbrace{(v^{(1)}+d^{(1)})}_{x^{(1)}}+\frac{\gamma_2}{\gamma_1}\underbrace{(v^{(2)}+d^{(2)})}_{x^{(2)}}\right] \]

特征图 z 的优化

关于特征图的优化问题可以表示为

\[\mathop{\text{min}}\limits_z\frac{\lambda_r}{2}||b-MDz||_2^2+\lambda_p||z||_1 \]

\[\left[ \begin{array}{} v^{(1)}\\ v^{(2)} \end{array} \right]= \left[ \begin{array}{} Dz\\z \end{array} \right] \]

\[\left[\begin{array}{}u_k^{(1)}\\u_k^{(2)}\end{array}\right]=\left[\begin{array}{}v_{k+1}^{(1)}\\v_{k+1}^{(2)}\end{array}\right] \]

可转化为增广拉格朗日问题

\[L_\gamma(z,v^{(\cdot)},d^{(\cdot)})=\frac{\lambda_r}{2}||b-MDz||_2^2+\gamma_1[d^{(1)}]^T(v^{(1)}-Dz)+\frac{\gamma_1}{2}||v^{(1)}-Dz||_2^2+...\\ \lambda_p||v^{(2)}||_1+\gamma_2[d^{(2)}]^T(u^{(2)}-z)+\frac{\gamma_2}{2}||u^{(2)}-z||_2^2 \]

上述增广拉格朗日问题可划分为若干个子问题:

\[z=\left[D^TD+\frac{\gamma_2}{\gamma_1}I\right]^{-1}\left[D^T(v^{(1)}+d^{(1)})+\frac{\gamma_2}{\gamma_1}(v^{(2)}+d^{(2)})\right] \]

\[u^{(1)}=\left[M^TM+\frac{\gamma_1}{\lambda_r}I\right]^{-1}\left[M^Tb+\frac{\gamma_1}{\lambda_r}(v^{(1)}-d^{(1)})\right] \]

\[u^{(2)}=\text{max}\left(1-\frac{\gamma_2/\lambda_p}{|v^{(2)}-d^{(2)}|},0\right)\odot (v^{(2)}-d^{(2)}) \]

\[d^{(1)}=d^{(1)}+u^{(1)}-v^{(1)}\\ d^{(2)}=d^{(2)}+u^{(2)}-v^{(2)} \]

其中z的更新可以通过矩阵逆引理简化为

\[z=\underbrace{\frac{\gamma_1}{\gamma_2}\left[I-D^T(\frac{\gamma_2}{\gamma_1}I+DD^T)^{-1}D\right]}_{\text{Matrix inversion lemma}}\left[D^T\underbrace{(v^{(1)}+d^{(1)})}_{x^{(1)}}+\frac{\gamma_2}{\gamma_1}\underbrace{(v^{(2)}+d^{(2)})}_{x^{(2)}}\right] \]

近端算子

Quadratic prox for masked data:

\[\textbf{prox}_{f/\lambda}(u)=(M^TM+\lambda\mathbb{I})^{-1}(M^Tb+\lambda u) \]

Shrinkage prox for sparsity:

\[\textbf{prox}_{f/\lambda}(u)=\text{max}(1-\frac{1}{\lambda|u|},0)\odot u \]

Projection prox for kernel constrains:

\[\textbf{prox}_{f/\lambda}(u)=\begin{cases}\frac{u}{||u||_2}\ &:||u||_2^2\ge1\\u\ &:\text{else}\end{cases} \]

上一篇:强化学习 7——Deep Q-Learning(DQN)公式推导


下一篇:强化学习-SARSA(lambda)路径规划