ℓ1-Constrained Least Squares
In sparse learning, ℓ1-constrained LS, also known as Lasso regression, is a commonly used learning method:
$$\min_\theta J_{LS}(\theta) \quad \text{s.t.} \quad \|\theta\|_1 \le R$$
where $\|\theta\|_1 = \sum_{j=1}^{b} |\theta_j|$.
Generally speaking, the solution of an ℓ1-constrained LS tends to lie on a coordinate axis: the ℓ1 ball has its corners on the axes, so the constrained optimum typically sits at a corner, where several parameters $\theta_j$ are exactly zero (i.e. the solution is sparse).
Then how do we solve it? Since the absolute value is not differentiable at the origin, solving an ℓ1-constrained LS is not as easy as solving the ℓ2-constrained one. However, we can still apply the method of Lagrange multipliers and work with the penalized form:
$$\min_\theta J(\theta), \qquad J(\theta) = J_{LS}(\theta) + \lambda \|\theta\|_1$$
Note that
$$|\theta_j| \le \frac{\theta_j^2}{2 c_j} + \frac{c_j}{2}, \qquad c_j > 0$$
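To see why this bound holds, it is enough to expand the square $(|\theta_j| - c_j)^2 \ge 0$:
$$\theta_j^2 - 2 c_j |\theta_j| + c_j^2 \ge 0 \quad\Longrightarrow\quad |\theta_j| \le \frac{\theta_j^2}{2 c_j} + \frac{c_j}{2},$$
with equality exactly when $c_j = |\theta_j|$.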
Hence we can minimize this upper bound of $J(\theta)$ instead of $J(\theta)$ itself. At each iteration we plug in the current solution $\tilde\theta_j \ne 0$ as $c_j$, which gives the upper bound
$$|\theta_j| \le \frac{\theta_j^2}{2|\tilde\theta_j|} + \frac{|\tilde\theta_j|}{2}.$$
If $\tilde\theta_j = 0$, we simply set $\theta_j = 0$. Using the generalized (pseudo) inverse, the bound above can be written uniformly as
$$|\theta_j| \le \frac{|\tilde\theta_j|^{\dagger}}{2}\,\theta_j^2 + \frac{|\tilde\theta_j|}{2}.$$
Therefore, we obtain the following ℓ2-regularized LS problem:
$$\hat\theta = \arg\min_\theta \tilde J(\theta), \qquad \tilde J(\theta) = J_{LS}(\theta) + \frac{\lambda}{2}\,\theta^{T} \tilde\Theta^{\dagger} \theta + C$$
where $\tilde\Theta = \mathrm{diag}\bigl(|\tilde\theta_1|, \ldots, |\tilde\theta_b|\bigr)$ and $C = \frac{\lambda}{2}\sum_{j=1}^{b} |\tilde\theta_j|$ are independent of $\theta$.
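Summing the per-coordinate bounds above confirms that $\tilde J$ really is an upper bound of $J$ that touches it at the current estimate $\tilde\theta$:
$$\lambda\|\theta\|_1 \le \frac{\lambda}{2}\,\theta^{T}\tilde\Theta^{\dagger}\theta + C, \qquad\text{so}\qquad J(\theta) \le \tilde J(\theta) \quad\text{and}\quad J(\tilde\theta) = \tilde J(\tilde\theta),$$
which is why each iteration of the procedure below can only decrease $J$.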
Take the parameterized linear model as an example,
$$f_\theta(x) = \theta^{T} \phi(x).$$
Then, minimizing $\tilde J(\theta)$ for this model, i.e. setting its derivative with respect to $\theta$ to zero (with $J_{LS}(\theta) = \frac{1}{2}\|\Phi\theta - y\|^2$ and design matrix $\Phi_{ij} = \phi_j(x_i)$), we get
$$\hat\theta = \bigl(\Phi^{T}\Phi + \lambda\,\tilde\Theta^{\dagger}\bigr)^{-1} \Phi^{T} y.$$
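For instance, assuming a design matrix Phi (with rows $\phi(x_i)^T$), targets y, a regularization weight lambda, and a current estimate t0 are already in the workspace (these variable names are chosen here just for illustration), one update step in MATLAB reads:

Tinv = pinv(diag(abs(t0)));               % generalized inverse of Theta~
t = (Phi'*Phi + lambda*Tinv) \ (Phi'*y);  % theta^ = (Phi'Phi + lambda*Theta~^+)^(-1) Phi'y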
We then renew the estimate by setting $\tilde\theta = \hat\theta$, recompute $\hat\theta$, and repeat until $\hat\theta$ reaches the required precision.
In summary, the whole algorithm goes as follows:
1. Initialize $\theta_0$ and set $i=1$.
2. Compute $\tilde\Theta_i$ from $\theta_{i-1}$.
3. Estimate $\theta_i$ using $\tilde\Theta_i$.
4. Set $i \leftarrow i+1$ and go back to step 2, until $\theta_i$ has converged.
An example in MATLAB:
n=50; N=1000;                                  % number of training samples and test points
x=linspace(-3,3,n)'; X=linspace(-3,3,N)';      % training inputs and test inputs
pix=pi*x;
y=sin(pix)./(pix)+0.1*x+0.2*rand(n,1);         % noisy sinc-like targets
hh=2*0.3^2; l=0.1; t0=randn(n,1); x2=x.^2;     % Gaussian kernel width, lambda, initial theta
k=exp(-(repmat(x2,1,n)+repmat(x2',n,1)-2*(x*x'))/hh);   % n-by-n Gaussian kernel design matrix
k2=k^2; ky=k*y;                                % Phi'*Phi and Phi'*y (k is symmetric)
for o=1:1000
  t=(k2+l*pinv(diag(abs(t0))))\ky;             % theta = (Phi'Phi + lambda*Theta~^+)^(-1) Phi'y
  if norm(t-t0)<0.001, break, end              % stop once the update is small enough
  t0=t;                                        % renew the estimate
end
K=exp(-(repmat(X.^2,1,n)+repmat(x2',N,1)-2*X*x')/hh);   % kernel matrix between test and training inputs
F=K*t;                                         % predictions on the test inputs
figure(1); clf; hold on; axis([-2.8,2.8,-1,1.5]);
plot(X,F,'g-'); plot(x,y,'bp');                % learned function (green) and training data (blue)
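Because of the ℓ1 penalty, many of the learned kernel weights in t come out (numerically) zero. As a quick check of this sparsity, one might continue the script above with, for example:

figure(2); clf; stem(t);       % visualize the learned weights theta
disp(sum(abs(t)<0.001));       % number of (near-)zero coefficients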