L2-Constrained Least Squares Learning

In ordinary least squares, noisy samples can cause the learned output to overfit, so it is reasonable to constrain the parameter space. Throughout this note, $J_{LS}(\theta) = \frac{1}{2}\|\Phi\theta - y\|^2$ denotes the least-squares training error, where $\Phi$ is the design matrix and $y$ the vector of targets. We focus on the simplest case, $\ell_2$-constrained least squares:

$$\min_{\theta} J_{LS}(\theta) \quad \text{s.t.} \quad \|\theta\|^2 \le R$$

To solve this constrained optimization problem, we can use the Lagrangian dual formulation and consider the following problem:
$$\max_{\lambda} \min_{\theta} \left[ J_{LS}(\theta) + \frac{\lambda}{2} \left( \|\theta\|^2 - R \right) \right]$$

A brief review of the Lagrangian dual problem is given in the appendix at the end of this note.
In practice, however, it is not necessary to fix $R$ explicitly; we can instead treat $\lambda$ itself as the tuning parameter and estimate $\theta$ as
$$\hat{\theta} = \mathop{\arg\min}_{\theta} \left[ J_{LS}(\theta) + \frac{\lambda}{2} \|\theta\|^2 \right]$$

where the first term $J_{LS}(\theta)$ measures how well the model fits the training data, while the penalty $\frac{\lambda}{2}\|\theta\|^2$ discourages large parameter values and thus prevents overfitting to some degree.
Taking the partial derivative of $J_{LS}(\theta) + \frac{\lambda}{2}\|\theta\|^2$ with respect to $\theta$ and setting it to zero, we get the solution
$$\hat{\theta} = \left( \Phi^T \Phi + \lambda I \right)^{-1} \Phi^T y$$
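
To spell out this step (a brief sketch, using the definition of $J_{LS}$ given earlier):

$$\nabla_\theta \left[ \frac{1}{2}\|\Phi\theta - y\|^2 + \frac{\lambda}{2}\|\theta\|^2 \right] = \Phi^T(\Phi\theta - y) + \lambda\theta = \left( \Phi^T\Phi + \lambda I \right)\theta - \Phi^T y = 0$$

Since $\Phi^T\Phi + \lambda I$ is positive definite for any $\lambda > 0$, it is invertible and the solution is unique.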

A more general formulation uses a regularization matrix $G$:

$$\min_{\theta} J_{LS}(\theta) \quad \text{s.t.} \quad \theta^T G \theta \le R$$

with the corresponding solution

$$\hat{\theta} = \left( \Phi^T \Phi + \lambda G \right)^{-1} \Phi^T y$$
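
As a minimal sketch in MATLAB (the variable names Phi, y, G, and lambda are placeholders assumed here; they are not defined in the note), this solution is a single linear solve:

% Hypothetical setup: Phi is the n-by-b design matrix, y the n-by-1 targets,
% G a b-by-b positive semidefinite regularizer, lambda the regularization strength.
theta = (Phi'*Phi + lambda*G) \ (Phi'*y);   % backslash solves the system without forming an inverse

Setting G to the identity matrix recovers the ordinary $\ell_2$-regularized solution above.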

Example: the Gaussian kernel model

$$f_{\theta}(x) = \sum_{j=1}^{n} \theta_j K(x, x_j), \qquad K(x, c) = \exp\left( -\frac{\|x - c\|^2}{2h^2} \right)$$
The MATLAB code is as follows:
n=50; N=1000;                                 % n training samples, N test points
x=linspace(-3,3,n)'; X=linspace(-3,3,N)';     % training inputs x, test inputs X
pix=pi*x; y=sin(pix)./(pix)+0.1*x+0.2*randn(n,1);   % noisy sinc-plus-slope targets

x2=x.^2; X2=X.^2; hh=2*0.3^2; l=0.1;          % hh = 2h^2 with bandwidth h=0.3; l = lambda
k=exp(-(repmat(x2,1,n)+repmat(x2',n,1)-2*x*x')/hh);  % n-by-n kernel (design) matrix on training points
K=exp(-(repmat(X2,1,n)+repmat(x2',N,1)-2*X*x')/hh);  % N-by-n kernel matrix between test and training points
t1=k\y; F1=K*t1;                              % ordinary LS fit
t2=(k^2+l*eye(n))\(k*y); F2=K*t2;             % L2-constrained LS: Phi=k is symmetric, so Phi'*Phi=k^2

figure(1); clf; hold on; axis([-2.8,2.8,-1,1.5]);
plot(X,F1,'g-'); plot(X,F2,'r--'); plot(x,y,'bo');
legend('LS','L2-Constrained LS');

Appendix: The Lagrangian Dual Problem
Given differentiable convex functions $f: \mathbb{R}^d \to \mathbb{R}$ and $g: \mathbb{R}^d \to \mathbb{R}^p$, an optimization problem can be formulated as

$$\min_{t} f(t) \quad \text{s.t.} \quad g(t) \le 0$$

Let

$$\lambda = (\lambda_1, \dots, \lambda_p)^T$$

be the vector of Lagrange multipliers, and let

$$L(t, \lambda) = f(t) + \lambda^T g(t)$$

be the Lagrangian.
Then the dual of the problem above can be written as
$$\max_{\lambda} \inf_{t} L(t, \lambda) \quad \text{s.t.} \quad \lambda \ge 0$$

This is the Lagrangian dual problem; for a convex problem like the one above, solving it yields the same optimal $t$ as the primal problem.
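
To connect this back to the main text: taking $t = \theta$, $f(\theta) = J_{LS}(\theta)$, and $g(\theta) = \frac{1}{2}\left( \|\theta\|^2 - R \right)$ instantiates the template as

$$\max_{\lambda \ge 0} \min_{\theta} \left[ J_{LS}(\theta) + \frac{\lambda}{2} \left( \|\theta\|^2 - R \right) \right]$$

which is exactly the dual problem used for $\ell_2$-constrained least squares above.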