Huber Loss Minimization Learning

Huber regression
In least squares learning we use the $\ell_2$ loss, which gives accurate results when the data are clean. From the viewpoint of robustness, however, it is often better to use the least absolute deviation ($\ell_1$ loss) as the criterion, i.e.

$$\hat{\theta}_{\mathrm{LA}} = \arg\min_{\theta} J_{\mathrm{LA}}(\theta), \qquad J_{\mathrm{LA}}(\theta) = \sum_{i=1}^{n} |r_i|$$

where $r_i = f_{\theta}(x_i) - y_i$ is the residual error. By doing so, it is possible to make the learning method more robust at the cost of accuracy.
In order to balance robustness and accuracy, the Huber loss may be a good alternative:
$$\rho_{\mathrm{Huber}}(r) = \begin{cases} r^2/2 & (|r| \le \eta) \\ \eta|r| - \eta^2/2 & (|r| > \eta) \end{cases}$$

Then the optimization goal turns out to be:
$$\min_{\theta} J(\theta), \qquad J(\theta) = \sum_{i=1}^{n} \rho_{\mathrm{Huber}}(r_i)$$
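As a quick illustration of the piecewise form above, the Huber loss can be evaluated elementwise in MATLAB. This is only a minimal sketch: the threshold eta=1 and the residual grid are illustrative choices (eta=1 also matches the choice e=1 in the regression code further below).

eta=1; r=linspace(-3,3,601)';                    % illustrative threshold and a grid of residual values
rho=(abs(r)<=eta).*(r.^2/2) ...                  % quadratic (l2-like) part for |r|<=eta
   +(abs(r)>eta).*(eta*abs(r)-eta^2/2);          % linear (l1-like) part for |r|>eta
plot(r,rho);                                     % the loss grows only linearly for large residuals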

As usual, take the linear parameterized model as an example:
$$f_{\theta}(x) = \sum_{j=1}^{b} \theta_j \phi_j(x) = \theta^{\mathrm{T}} \phi(x)$$

The key idea is to bound the Huber loss from above by a quadratic function of the residual that is tight at the current residual $\tilde{r}_i$; repeatedly minimizing this bound gives an iteratively reweighted least-squares procedure. For simplicity, we omit the derivation and give the final outcome (for more details, refer to $\ell_1$-constrained LS):
$$\hat{\theta} = \arg\min_{\theta} \tilde{J}(\theta), \qquad \tilde{J}(\theta) = \frac{1}{2} \sum_{i=1}^{n} \tilde{\omega}_i r_i^2 + C$$

where $\tilde{\omega}_i = \begin{cases} 1 & (|\tilde{r}_i| \le \eta) \\ \eta/|\tilde{r}_i| & (|\tilde{r}_i| > \eta) \end{cases}$ and $C = \sum_{i:\,|\tilde{r}_i|>\eta}\left(\eta|\tilde{r}_i|/2 - \eta^2/2\right)$ are independent of $\theta$; here $\tilde{r}_i = f_{\tilde{\theta}}(x_i) - y_i$ is the residual at the current estimate $\tilde{\theta}$.
Therefore, the solution can be formulated as:
$$\hat{\theta} = \left(\Phi^{\mathrm{T}} \tilde{W} \Phi\right)^{-1} \Phi^{\mathrm{T}} \tilde{W} y$$
where $\tilde{W} = \mathrm{diag}(\tilde{\omega}_1, \ldots, \tilde{\omega}_n)$ and $\Phi$ is the $n \times b$ design matrix with entries $\Phi_{ij} = \phi_j(x_i)$.
By iterating this weighted least-squares update until convergence, we obtain $\hat{\theta}$ as an estimate of $\theta$. The corresponding MATLAB code is given below:
n=50; N=1000;                                   % number of training samples and test points
x=linspace(-3,3,n)'; X=linspace(-4,4,N)';       % training inputs and test inputs
y=x+0.2*randn(n,1); y(n)=-4;                    % noisy targets; the last sample is an outlier

p(:,1)=ones(n,1); p(:,2)=x;                     % design matrix Phi with basis functions (1, x)
t0=p\y; e=1;                                    % initialize by ordinary LS; e is the threshold eta
for o=1:1000
    r=abs(p*t0-y);                              % absolute residuals at the current estimate
    w=ones(n,1); w(r>e)=e./r(r>e);              % Huber weights: 1 if r<=eta, eta/r otherwise
    t=(p'*(repmat(w,1,2).*p))\(p'*(w.*y));      % weighted least-squares update
    if norm(t-t0)<0.001, break, end             % stop when the estimate has converged
    t0=t;
end
P(:,1)=ones(N,1); P(:,2)=X; F=P*t;              % evaluate the learned function on the test inputs

figure(1); clf; hold on; axis([-4,4,-4.5,3.5]);
plot(X,F,'g-'); plot(x,y,'bo');                 % learned line (green) and training data (blue circles)


Tukey regression
The Huber loss combines the $\ell_1$ and $\ell_2$ losses to balance robustness and accuracy. Since the $\ell_1$ part is still unbounded, gross outliers may still have a noticeable impact on the final outcome. To tackle that, the Tukey loss is a considerable alternative:

$$\rho_{\mathrm{Tukey}}(r) = \begin{cases} \left(1 - \left[1 - r^2/\eta^2\right]^3\right)\eta^2/6 & (|r| \le \eta) \\ \eta^2/6 & (|r| > \eta) \end{cases}$$

Of course, the Tukey loss is not a convex function, that is to say, there may be several locally optimal solutions. In actual applications, we apply the following weights in the iteratively reweighted least-squares procedure:
$$\omega = \begin{cases} \left(1 - r^2/\eta^2\right)^2 & (|r| \le \eta) \\ 0 & (|r| > \eta) \end{cases}$$

Hence samples whose residuals exceed $\eta$ receive zero weight, so such outliers can no longer have any impact on our estimation.
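As a rough sketch, Tukey regression can be obtained from the Huber MATLAB code above by swapping only the weight rule. The threshold e=1 and the ordinary-LS initialization below are illustrative assumptions rather than prescribed choices; since the loss is non-convex, the result may depend on this initialization.

n=50; x=linspace(-3,3,n)'; y=x+0.2*randn(n,1); y(n)=-4;   % same toy data with an outlier
p(:,1)=ones(n,1); p(:,2)=x; t0=p\y; e=1;                  % ordinary LS initialization; e is eta
for o=1:1000
    r=abs(p*t0-y);                                        % absolute residuals at the current estimate
    w=zeros(n,1); w(r<=e)=(1-r(r<=e).^2/e^2).^2;          % Tukey weights: zero for |r|>eta
    t=(p'*(repmat(w,1,2).*p))\(p'*(w.*y));                % weighted least-squares update
    if norm(t-t0)<0.001, break, end
    t0=t;
end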