Lipschitz constraint in deep learning

1. "Robust" models: satisfying the L constraint (Lipschitz constraint)

(1) Stability under parameter perturbations

For example, do the models $f_w(x)$ and $f_{w+\Delta w}(x)$ give similar results?

(2) Stability under input perturbations

For example, do $f_w(x)$ and $f_w(x+\Delta x)$ give similar results?

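2. The L constraint:

When $\|x_1 - x_2\|$ is small, we want $\|f_w(x_1) - f_w(x_2)\|$ to be small as well.

That is, there exists a constant $C$ (depending on the parameters but not on the input) such that the following always holds:

$\|f_w(x_1) - f_w(x_2)\| \le C(w) \cdot \|x_1 - x_2\|$

Here a smaller $C$ is better: it means the model is less sensitive to input perturbations.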
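To get a feel for the constant $C$, one can probe a model with random nearby input pairs and record the largest ratio $\|f(x_1) - f(x_2)\| / \|x_1 - x_2\|$. The sketch below does this in plain Python; `empirical_lipschitz` and `toy_model` are hypothetical names made up for illustration, and the result is only a lower bound on the true $C$ because sampling can miss the worst direction.

```python
import math
import random

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def empirical_lipschitz(f, dim, n_pairs=2000, eps=1e-3, seed=0):
    """Largest observed ||f(x1) - f(x2)|| / ||x1 - x2|| over random nearby pairs.

    This is a lower bound on the true constant C, not C itself.
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_pairs):
        x1 = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        x2 = [a + rng.uniform(-eps, eps) for a in x1]
        dx = norm([a - b for a, b in zip(x1, x2)])
        if dx == 0.0:
            continue  # skip degenerate pairs
        df = norm([a - b for a, b in zip(f(x1), f(x2))])
        best = max(best, df / dx)
    return best

def toy_model(x):
    """Hypothetical model f(x) = (3*x0, 0.5*x1); its true constant is 3."""
    return [3.0 * x[0], 0.5 * x[1]]

estimate = empirical_lipschitz(toy_model, dim=2)
print(estimate)  # never exceeds 3 (up to rounding), since 3 is the true constant
```

For a real network the same probe can be pointed at the forward pass, but a small observed ratio does not prove that the constraint holds everywhere.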

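3. The L constraint in neural networks:

Consider a single fully connected layer $f(Wx+b)$, where $f$ is the activation function and $W$, $b$ are the parameter matrix and vector. The constraint reads

$\|f(Wx_1+b) - f(Wx_2+b)\| \le C(W,b) \cdot \|x_1 - x_2\|$

When $x_1$ and $x_2$ are sufficiently close, a first-order expansion of the left-hand side gives

$\left\|\frac{\partial f}{\partial y} W (x_1 - x_2)\right\| \le C(W,b) \cdot \|x_1 - x_2\|$, where $y = Wx + b$

Common activation functions such as sigmoid and relu have bounded derivatives, so the absolute value of every element of $\partial f / \partial y$ is at most some constant. Absorbing that constant into $C$, it suffices to consider

$\|W(x_1 - x_2)\| \le C \cdot \|x_1 - x_2\|$

We want $C$ to be as small as possible, which gives a regularization term $C^2$ on the parameters.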

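4. Matrix norms:

Frobenius norm (F-norm), also called the L2 norm because it is the Euclidean norm of the matrix entries taken as one long vector. This is the norm behind the L2 regularization commonly used in deep learning.

$\|W\|_F = \sqrt{\sum_{i,j} w_{ij}^2}$

By the Cauchy inequality, $\|Wx\| \le \|W\|_F \cdot \|x\|$.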
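The Cauchy step can be spelled out row by row. A short derivation, writing $w_{ij}$ for the entries of $W$ and $x_j$ for the entries of $x$ (this index notation is assumed here for illustration):

```latex
\begin{aligned}
\|Wx\|^2 &= \sum_i \Big(\sum_j w_{ij} x_j\Big)^2 \\
         &\le \sum_i \Big(\sum_j w_{ij}^2\Big)\Big(\sum_j x_j^2\Big) \\
         &= \|W\|_F^2 \, \|x\|^2
\end{aligned}
```

The inequality applies Cauchy's inequality to each row of $W$ separately; taking square roots on both sides gives the bound.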

Spectral norm, also called the 2-norm. (It coincides with the spectral radius only for normal matrices, for example symmetric ones.)

$\|W\|_2 = \max_{x \ne 0} \frac{\|Wx\|}{\|x\|} = \sqrt{\lambda_{\max}(W^\top W)}$, where $\lambda_{\max}(W^\top W)$ is the largest eigenvalue of the Hermitian matrix $W^\top W$

In words, the spectral norm $\|W\|_2$ equals the square root of the largest (principal) eigenvalue of $W^\top W$. If $W$ is a symmetric square matrix, then $\|W\|_2$ equals the largest absolute value among the eigenvalues of $W$; for a general square matrix that quantity is only a lower bound on $\|W\|_2$.

Then:

$\|W(x_1 - x_2)\| \le \|W\|_2 \cdot \|x_1 - x_2\|$

$\|W\|_F$ is an upper bound on $\|W\|_2$, while $\|W\|_2$ is the tightest possible $C$. If tightness is not a concern, taking $C = \|W\|_F$ also works.
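To make the gap between the two norms concrete, here is a small sketch in plain Python. The helper names and the example matrix are hypothetical, and the closed-form spectral norm below only covers the 2x2 case (it uses the standard formula for the largest eigenvalue of a symmetric 2x2 matrix).

```python
import math

def frobenius_norm(W):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(w * w for row in W for w in row))

def spectral_norm_2x2(W):
    """Spectral norm of a real 2x2 matrix via the largest eigenvalue of W^T W."""
    (a, b), (c, d) = W
    # Entries of the symmetric matrix W^T W = [[p, q], [q, r]]
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Largest eigenvalue of a symmetric 2x2 matrix
    lam_max = 0.5 * (p + r) + math.sqrt(0.25 * (p - r) ** 2 + q * q)
    return math.sqrt(lam_max)

W = [[1.0, 2.0],
     [0.0, 1.0]]

spec = spectral_norm_2x2(W)   # 1 + sqrt(2), about 2.414
frob = frobenius_norm(W)      # sqrt(6), about 2.449
print(spec, frob)
```

For this $W$ both eigenvalues of $W$ itself equal 1, yet the spectral norm is $1 + \sqrt{2}$, so the largest absolute eigenvalue is not the spectral norm when the matrix is not symmetric. The Frobenius norm $\sqrt{6}$ sits slightly above it, as the upper bound predicts.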

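5. L2 regularization term (the relationship between L2 regularization and the F-norm):

Since the spectral norm has not been computed yet, start with the larger upper bound $\|W\|_F$. The loss of the neural network becomes

$\mathrm{loss} = \mathrm{loss}(y, f_w(x)) + \lambda \|W\|_F^2 = \mathrm{loss}(y, f_w(x)) + \lambda \sum_{i,j} w_{ij}^2$

This shows that L2 regularization helps the model satisfy the L constraint, which lowers its sensitivity to input perturbations and improves its generalization.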
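As a minimal sketch of how the penalty enters the objective, the function below adds $\lambda$ times the squared Frobenius norm of every weight matrix to a data loss that has already been computed. The function name, the sample weights, and the values of `data_loss` and `lam` are all made up for illustration.

```python
def l2_regularized_loss(data_loss, weights, lam):
    """data_loss + lam * (sum of squared Frobenius norms of all weight matrices)."""
    penalty = sum(w * w for W in weights for row in W for w in row)
    return data_loss + lam * penalty

# Hypothetical two-layer network: a 2x2 matrix and a 1x2 matrix
weights = [
    [[1.0, 2.0], [0.0, 1.0]],   # squared F-norm = 6.0
    [[0.5, -0.5]],              # squared F-norm = 0.5
]

total = l2_regularized_loss(data_loss=0.3, weights=weights, lam=0.01)
print(total)  # 0.3 + 0.01 * 6.5
```

Summing the squared F-norms over layers is the same as summing the squares of all weights, which is why this term is usually implemented as plain weight decay.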

6. Power iteration for the spectral norm: computing $\|W\|_2^2$ amounts to finding the largest eigenvalue of $W^\top W$.

Finding the eigenvalue: starting from a random initial vector $v$, repeat

$v \leftarrow \frac{(W^\top W) v}{\|(W^\top W) v\|}$

After several iterations, $\|W\|_2^2 \approx v^\top W^\top W v$.

This is equivalent to

$v \leftarrow \frac{W^\top u}{\|W^\top u\|}$, $u \leftarrow \frac{W v}{\|W v\|}$, $\|W\|_2 \approx u^\top W v$

That is, initialize $u$ and $v$, iterate several times to update them, then substitute them into the last formula to get an approximation of $\|W\|_2$.
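The alternating update of $u$ and $v$ can be sketched in plain Python as follows. This is an illustrative implementation under assumed names (`spectral_norm_power_iteration`, `n_iters`, `seed`), not code from the referenced papers; matrices are stored as lists of rows.

```python
import math
import random

def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * a for m, a in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def normalize(x):
    """Scale a vector to unit Euclidean length."""
    n = math.sqrt(sum(a * a for a in x))
    return [a / n for a in x]

def spectral_norm_power_iteration(W, n_iters=100, seed=0):
    """Approximate ||W||_2 by alternating v <- W^T u and u <- W v (n_iters >= 1)."""
    rng = random.Random(seed)
    Wt = transpose(W)
    # u lives in the output space of W, so it has one entry per row of W
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(len(W))])
    for _ in range(n_iters):
        v = normalize(matvec(Wt, u))
        u = normalize(matvec(W, v))
    # ||W||_2 is approximated by u^T W v
    return sum(a * b for a, b in zip(u, matvec(W, v)))

W = [[1.0, 2.0],
     [0.0, 1.0]]
sigma = spectral_norm_power_iteration(W)
print(sigma)  # close to 1 + sqrt(2), the exact spectral norm of this W
```

Only $u$ needs a random start, since $v$ is overwritten in the first step. Convergence is fast when the largest singular value is well separated from the second one, which is why a handful of iterations per training step is often enough in practice.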

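7. Spectral Norm Regularization:

The F-norm gives a coarser condition; the more accurate norm is the spectral norm. The loss of the neural network then becomes

$\mathrm{loss} = \mathrm{loss}(y, f_w(x)) + \lambda \|W\|_2^2$

PyTorch code for computing the spectral norm (to be continued)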

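8. Gradient penalty: does it only take effect in a local region?

Take $|f(x_1) - f(x_2)| \le \|x_1 - x_2\|$, so that $C = 1$. Then $\|\partial f / \partial x\| \le 1$ is a sufficient condition for this inequality, so it can be added to the network's loss as a penalty term:

$\mathrm{loss} = \mathrm{loss}(y, f_w(x)) + \lambda \max\left(\left\|\frac{\partial f_w}{\partial x}\right\| - 1,\ 0\right)$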
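A small sketch of such a penalty, evaluated at a handful of sample points. Several things here are assumptions for illustration: the one-sided hinge form $\lambda \max(\|\nabla_x f\| - 1, 0)$, the central finite differences standing in for automatic differentiation, and the toy functions `steep` and `gentle`.

```python
import math

def grad_norm(f, x, h=1e-6):
    """Euclidean norm of a central finite-difference gradient of scalar f at x."""
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return math.sqrt(sum(a * a for a in g))

def gradient_penalty(f, samples, lam=10.0):
    """lam * mean over the sample points of max(||grad f(x)|| - 1, 0)."""
    total = sum(max(grad_norm(f, x) - 1.0, 0.0) for x in samples)
    return lam * total / len(samples)

def steep(x):
    """Hypothetical function with gradient (3, 4), i.e. gradient norm 5."""
    return 3.0 * x[0] + 4.0 * x[1]

def gentle(x):
    """Hypothetical function with gradient (0.3, 0.4), i.e. gradient norm 0.5."""
    return 0.3 * x[0] + 0.4 * x[1]

samples = [[0.0, 0.0], [1.0, -1.0], [0.5, 2.0]]
print(gradient_penalty(steep, samples))   # about 10 * (5 - 1) = 40
print(gradient_penalty(gentle, samples))  # 0.0, the constraint is already met
```

The penalty only sees the gradient at the points in `samples`, so it constrains the function near those points and nowhere else.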

9. Spectral Normalization: every parameter matrix $W$ in $f$ is replaced by $W / \|W\|_2$.

Gradient penalty takes longer to run per epoch than spectral normalization.
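The replacement of $W$ by $W / \|W\|_2$ can be sketched in plain Python by combining a power-iteration estimate of the spectral norm with an elementwise division. The helper names and the example matrix are hypothetical; in PyTorch the built-in `torch.nn.utils.spectral_norm` wrapper plays this role for a layer's weight.

```python
import math
import random

def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * a for m, a in zip(row, x)) for row in M]

def unit(x):
    """Scale a vector to unit Euclidean length."""
    n = math.sqrt(sum(a * a for a in x))
    return [a / n for a in x]

def spectral_norm(W, n_iters=100, seed=0):
    """Power-iteration estimate of ||W||_2 (n_iters >= 1)."""
    rng = random.Random(seed)
    Wt = [list(col) for col in zip(*W)]
    u = unit([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = unit(matvec(Wt, u))
        u = unit(matvec(W, v))
    return sum(a * b for a, b in zip(u, matvec(W, v)))

def spectral_normalize(W, n_iters=100):
    """Return W / ||W||_2, whose spectral norm is 1 up to the estimation error."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]

W = [[3.0, 1.0],
     [1.0, 2.0],
     [0.0, 1.0]]
W_sn = spectral_normalize(W)
print(spectral_norm(W_sn))  # close to 1
```

Because the normalized matrix has spectral norm 1, a linear layer that uses it satisfies the L constraint with $C = 1$ by construction, and no penalty term is needed in the loss.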

References:

https://spaces.ac.cn/archives/6051

"Spectral Norm Regularization for Improving the Generalizability of Deep Learning"

"Spectral Normalization for Generative Adversarial Networks", ICLR 2018
