Navigator
Interpolation
Question: When does there exist a convex function
f
:
R
k
→
R
f:\mathbb{R}^k\to \mathbb{R}
f:Rk→R with
d
o
m
f
=
R
k
dom f=\mathbb{R}^k
domf=Rk, that satisfies the interpolation conditions
f
(
u
i
)
=
y
i
,
i
=
1
,
…
,
m
f(u_i)=y_i, i=1, \dots, m
f(ui)=yi,i=1,…,m
at given points
u
i
∈
R
k
u_i\in\mathbb{R}^k
ui∈Rk?
Answer: If and only if there exist
g
1
,
g
2
,
…
,
g
m
g_1, g_2, \dots, g_m
g1,g2,…,gm such that
y
j
≥
y
i
+
g
i
T
(
u
j
−
u
i
)
,
i
,
j
=
1
,
…
,
m
(1)
y_j\geq y_i+g_i^T(u_j-u_i), \quad i,j=1, \dots, m\tag{1}
yj≥yi+giT(uj−ui),i,j=1,…,m(1)
At each
u
i
u_i
ui we can find a vector
g
i
g_i
gi such that
f
(
z
)
≥
f
(
u
i
)
+
g
i
T
(
z
−
u
i
)
(2)
f(z)\geq f(u_i)+g_i^T(z-u_i)\tag{2}
f(z)≥f(ui)+giT(z−ui)(2)
for all
z
z
z. If
f
f
f is differentiable, we can take
g
i
=
∇
f
(
u
i
)
g_i=\nabla f(u_i)
gi=∇f(ui) by finding a supporting hyperplane to
e
p
i
f
epi f
epif at
(
u
i
,
y
i
)
(u_i, y_i)
(ui,yi) (
g
i
g_i
gi are called subgradients). By applying
(
2
)
(2)
(2) to
z
=
u
j
z=u_j
z=uj, we obtain
(
1
)
(1)
(1).
Conversely, suppose
g
1
,
g
2
,
…
,
g
m
g_1, g_2, \dots, g_m
g1,g2,…,gm satisfy
(
1
)
(1)
(1). Define
f
f
f as
f
(
z
)
=
max
i
=
1
,
…
,
m
(
y
i
+
g
i
T
(
z
−
u
i
)
)
f(z)=\max_{i=1, \dots, m}(y_i+g_i^T(z-u_i))
f(z)=i=1,…,mmax(yi+giT(z−ui))
Fitting a convex function to given data
min
∑
i
=
1
m
(
y
i
−
f
(
u
i
)
)
2
s
.
t
.
{
f
:
R
k
→
R
d
o
m
f
=
R
k
\min\sum_{i=1}^m (y_i-f(u_i))^2\\ s.t. \begin{cases} f:\mathbb{R}^k\to\mathbb{R} \\ dom f=\mathbb{R}^k \end{cases}
mini=1∑m(yi−f(ui))2s.t.{f:Rk→Rdomf=Rk
This is an infinite-dimensional problem, since the variable is
f
f
f, which is in the space of continous real-valued functions on
R
k
\mathbb{R}^k
Rk. According to the result abvove, we can formulate this problem as
min
∑
i
=
1
m
(
y
i
−
y
^
)
2
s
.
t
.
y
^
j
≥
y
^
i
+
g
i
T
(
u
j
−
u
i
)
\min \sum_{i=1}^m(y_i-\hat{y})^2\\ s.t.\quad \hat{y}_j\geq \hat{y}_i+g_i^T(u_j-u_i)
mini=1∑m(yi−y^)2s.t.y^j≥y^i+giT(uj−ui)
cvx/yalmip code
%% data
rng(29);
nos = 0.05;
u = [0:0.04:2]';
m = length(u);
y = 5*(u-1).^4+0.6*(u-1).^2+0.5*u;
v1=u>=0.4;
v2=u<=0.6;
v3 = v1.*v2;
dipvec = ((v3.*u-0.4*ones(size(v3)).^2)).*v3;
y = y+40*(dipvec-(0.4)^2*v3);
%% add perturbation
randf = nos*(rand(m, 1)-0.5);
yns = y+norm(y)*randf;
%% cvx: fitting
cvx_begin
variables yhat(m) g(m)
minimize(norm(yns-yhat))
subject to
yhat*ones(1, m)>= ones(m, 1)*yhat'+(ones(m, 1)*g').*(u*ones(1, m)-ones(m, 1)*u');
cvx_end
%% yalmip
y_yal = sdpvar(m, 1);
g = sdpvar(m, 1);
F = [y_yal*ones(1,m)>=ones(m, 1)*y_yal'+(ones(m, 1)*g').*(u*ones(1, m)-ones(m, 1)*u')];
obj = norm(yns-y_yal);
optimize(F, obj);
value(y_yal)
%%
nobs = 1000;
t = linspace(0, 2, nobs);
f = max(yhat(:, ones(1, nobs))+g(:, ones(1, nobs)).*(t(ones(m, 1),:)-u(:, ones(1, nobs))));
%%
figure;
hold on;
plot(u, yns, 'o');
plot(t, f, '-');
set(gca,'xtick',[],'ytick',[],'xcolor','w','ycolor','w');
Penalty function approx.
The log barrier penalty function is as follows:
ϕ
(
u
)
=
{
−
a
2
log
(
1
−
(
u
/
a
)
2
)
∣
u
∣
<
a
∞
∣
u
∣
≥
a
\phi(u)= \begin{cases} -a^2\log(1-(u/a)^2)&|u|<a\\ \infty & |u|\geq a \end{cases}
ϕ(u)={−a2log(1−(u/a)2)∞∣u∣<a∣u∣≥a
Comparing the
e
l
l
1
,
e
l
l
2
ell_1, ell_2
ell1,ell2 and deadzone-linear and log-barrier penalty functions for the approximation problem:
min
ϕ
(
A
x
−
b
)
\min \phi(Ax-b)
minϕ(Ax−b)
log barrier penalty
The log barrier penalty
function puts weight very much like the
l
2
l_2
l2-norm penalty for small residuals, but puts very strong weight on residuals larger than
y
1
y_1
y1, and infinite weight on residuals larger than
y
2
y_2
y2.
%%
rng(729);
m = 100;
n = 30;
A = randn(m, n);
b = randn(m, 1);
%% ell_1 approx.
disp('ell 1 approximation');
cvx_begin
variable x1(n)
minimize (norm(A*x1+b, 1))
cvx_end
%% ell_2 approx.
disp('ell 2');
x2 = -A\b;
%% deadzone
disp('deadzone penalty');
dz = 0.5;
cvx_begin
variable xdz(n)
minimize(sum(max(abs(A*xdz+b)-dz, 0)));
cvx_end
%% log-barrier penalty approximation
disp('log-barrier');
% parameters for Newton method & Line search
alpha = 0.01;
beta = 0.5;
% min infty norm to get starting point
cvx_begin
variable xlb(n)
minimize norm(A*xlb+b, Inf)
cvx_end
linf = cvx_optval;
A = A/(1.1*linf);
b = b/(1.1*linf);
for iters = 1:50
yp=1-(A*xlb+b);
ym=(A*xlb+b)+1;
f = -sum(log(yp))-sum(log(ym));
g = A'*(1./yp)-A'*(1./ym);
H = A'*diag(1./(yp.^2)+1./(ym.^2))*A;
v = -H\g;
fprime = g'*v;
ntdecr = sqrt(-fprime);
if (ntdecr<1e-5)
break;
end
t=1;
newx = xlb+t*v;
while (min(1-(A*newx+b))<0 | min(A*newx+b+1)<0)
t = beta*t;
newx = xlb+t*v;
end
newf = -sum(log(1-(A*newx+b)))-sum(log(1+(A*newx+b)));
while (newf>f+alpha*t*fprime)
t = beta*t;
newx = xlb+t*v;
newf = -sum(log(1-(A*newx+b)))-sum(log(1+(A*newx+b)));
end
xlb = xlb+t*v;
end
%% plot histogram of residuals
ss = max(abs([A*x1+b; A*x2+b; A*xdz+b; A*xlb+b]));
tt = -ceil(ss):0.05:ceil(ss);
[N1, hist1] = hist(A*x1+b, tt);
[N2, hist2] = hist(A*x2+b, tt);
[N3, hist3] = hist(A*xdz+b, tt);
[N4, hist4] = hist(A*xlb+b, tt);
%%
range_max=2.0; rr=-range_max:1e-2:range_max;
figure(1), clf, hold off
subplot(4,1,1),
bar(hist1,N1);
hold on
plot(rr, abs(rr)*40/3, '-');
ylabel('p=1')
axis([-range_max range_max 0 40]);
hold off
subplot(4,1,2),
bar(hist2,N2);
hold on;
plot(rr,2*rr.^2),
ylabel('p=2')
axis([-range_max range_max 0 11]);
hold off
subplot(4,1,3),
bar(hist3,N3);
hold on
plot(rr,30/3*max(0,abs(rr)-dz))
ylabel('Deadzone')
axis([-range_max range_max 0 25]);
hold off
subplot(4,1,4),
bar(hist4,N4);
rr_lb=linspace(-1+(1e-6),1-(1e-6),600);
hold on
plot(rr_lb, -3*log(1-rr_lb.^2),rr,2*rr.^2,'--')
axis([-range_max range_max 0 11]);
ylabel('Log barrier'),
xlabel('r')
hold off
Reference
Convex Optimization S.Boyd Page 352