Navigator
- Probability distribution
- Norm ball constraint
- Robust approximation
- Worst-case robust approximation
- Finite Set
- Norm bound error
- Uncertainty ellipsoids
- Reference
Probability distribution
We can impose the constraint that $x$ satisfy $x\succeq 0$, $\mathbf{1}^Tx=1$, so that $x$ represents a probability distribution. The problem becomes

$$
\min\ \lVert Ax-b\rVert \qquad \text{s.t.}\ \begin{cases} x\succeq 0\\ \mathbf{1}^Tx=1 \end{cases}
$$
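A minimal CVX sketch of this problem (the sizes m, n and the data A, b below are hypothetical placeholders):

% norm approximation over the probability simplex
m = 30; n = 10;
A = randn(m, n); b = randn(m, 1);   % hypothetical data
cvx_begin quiet
    variable x(n)
    minimize (norm(A*x - b))
    subject to
        x >= 0;        % x >= 0 elementwise
        sum(x) == 1;   % 1'x = 1
cvx_end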
Norm ball constraint
We can add to the basic norm approximation problem the constraint that $x$ lie in a norm ball:

$$
\min\ \lVert Ax-b\rVert \qquad \text{s.t.}\quad \lVert x-x_0\rVert\leq d
$$
where $x_0$ is a prior guess of the parameter $x$, and $d$ is the maximum plausible deviation of our estimate from the prior guess. The constraint $\lVert x-x_0\rVert\leq d$ defines a trust region.
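A minimal CVX sketch of the trust-region variant, reusing A and b from the sketch above with hypothetical values for x0 and d:

% norm approximation restricted to a norm ball (trust region)
x0 = zeros(n, 1);   % hypothetical prior guess
d = 1;              % hypothetical trust-region radius
cvx_begin quiet
    variable x(n)
    minimize (norm(A*x - b))
    subject to
        norm(x - x0) <= d;   % trust-region constraint
cvx_end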
Robust approximation
We assume that $A$ is a random variable taking values in $\mathbb{R}^{m\times n}$, with mean $\bar{A}$, so we can describe $A$ as

$$
A=\bar{A}+U
$$
where $U$ is a random matrix with zero mean. Here, the constant matrix $\bar{A}$ gives the average value of $A$, and $U$ describes its statistical variation.
It is natural to use the expected value of $\lVert Ax-b\rVert$ as the objective:

$$
\min\ \mathbb{E}\lVert Ax-b\rVert \tag{1}
$$
We refer to this problem as the stochastic robust approximation problem. It is always a convex optimization problem, but it is usually not tractable, since in most cases it is very difficult to evaluate the objective or its derivatives.
When $A$ assumes only a finite number of values, i.e.,

$$
\operatorname{prob}(A=A_i)=p_i, \quad i=1, \dots, k
$$

where $A_i\in\mathbb{R}^{m\times n}$, $\mathbf{1}^Tp=1$, $p\succeq 0$, the problem has the form
$$
\min\quad p_1\lVert A_1x-b\rVert+\dots+p_k\lVert A_kx-b\rVert
$$
which is often called a sum-of-norms problem. It can be expressed as
$$
\min\quad p^Tt \qquad \text{s.t.}\quad \lVert A_ix-b\rVert\leq t_i, \quad i=1, \dots, k
$$
If the norm is the Euclidean norm, this sum-of-norms problem is an SOCP. If the norm is the $\ell_1$ or $\ell_\infty$ norm, it can be expressed as an LP.
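A minimal CVX sketch of the sum-of-norms problem (k, the matrices A_i, the probabilities p, and the data are hypothetical placeholders):

% sum-of-norms problem: minimize sum_i p_i*||A_i*x - b||
m = 30; n = 10; k = 3;
Ai = {randn(m, n), randn(m, n), randn(m, n)};   % hypothetical A_1, ..., A_k
p = [0.5; 0.3; 0.2];                            % 1'p = 1, p >= 0
b = randn(m, 1);
cvx_begin quiet
    variable x(n)
    obj = 0;
    for i = 1:k
        obj = obj + p(i)*norm(Ai{i}*x - b);   % accumulate weighted norms
    end
    minimize (obj)
cvx_end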
Example: Consider the statistical robust least-squares problem
$$
\min\quad \mathbb{E}\lVert Ax-b\rVert_2^2
$$
where the norm is the Euclidean norm. We can express the objective as
$$
\begin{aligned}
\mathbb{E}\lVert Ax-b\rVert_2^2&=\mathbb{E}(\bar{A}x-b+Ux)^T(\bar{A}x-b+Ux)\\
&=(\bar{A}x-b)^T(\bar{A}x-b)+\mathbb{E}\,x^TU^TUx\\
&=\lVert\bar{A}x-b\rVert_2^2+x^TPx
\end{aligned}
$$
where $P=\mathbb{E}U^TU$; the cross terms vanish because $\mathbb{E}U=0$. Therefore the statistical robust approximation problem has the form of a regularized least-squares problem
$$
\min\ \lVert\bar{A}x-b\rVert_2^2+\lVert P^{1/2}x\rVert_2^2
$$
with solution

$$
x=(\bar{A}^T\bar{A}+P)^{-1}\bar{A}^Tb
$$
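Since the solution is closed-form, no solver is needed; a one-line MATLAB sketch, assuming Abar (the mean of A), P = E[U'*U], and b are given:

% regularized least-squares solution x = (Abar'*Abar + P)^{-1}*Abar'*b
x = (Abar'*Abar + P) \ (Abar'*b);   % backslash solves the system without forming an inverse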
Worst-case robust approximation
It is also possible to model the variation in the matrix $A$ using a set-based, worst-case approach. We describe the uncertainty by a set of possible values for $A$:

$$
A\in\mathcal{A}\subseteq\mathbb{R}^{m\times n}
$$
which is assumed to be nonempty and bounded. The associated worst-case error of a candidate approximate solution $x\in\mathbb{R}^n$ is defined as

$$
e_{wc}(x)=\sup\{\lVert Ax-b\rVert\mid A\in\mathcal{A}\}
$$
which is always a convex function of $x$. However, tractability depends on the norm used and on the description of the uncertainty set $\mathcal{A}$.
Comparison of stochastic and worst-case robust approximation
Consider the least-squares problem
$$
\min\ \lVert A(u)x-b\rVert_2^2
$$
where $u\in\mathbb{R}$ is an uncertain parameter and $A(u)=A_0+uA_1$. Writing $A=A_0$, $B=A_1$, and $t=u$, with $t$ an uncertain parameter in $[-1, 1]$, we compare three least-squares solutions for $\lVert (A+tB)x-b\rVert_2$:
- nominal optimal: take $t=0$ and minimize $\lVert Ax-b\rVert_2$
- stochastic robust approximation, assuming $t$ is uniformly distributed on $[-1, 1]$:
$$
\min\ \mathbb{E}\lVert (A+tB)x-b\rVert_2^2=\lVert Ax-b\rVert_2^2+x^TPx
$$
where $P=\mathbb{E}[t^2]B^TB=(1/3)B^TB$
- worst-case robust approximation:
$$
\sup_{-1\leq t\leq 1}\lVert (A+tB)x-b\rVert_2=\max\{\lVert(A-B)x-b\rVert_2,\ \lVert(A+B)x-b\rVert_2\}
$$
In an optimal design setting, the variation can represent uncertainty in the linear equations that relate the design variables $x$ to the result vector $Ax$.
CVX code
%% comparison of stochastic and worst-case robust
rng(0);
m = 20;
n = 10;
A = randn(m, n);
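% replace the singular values of A by logspace(-1, 1, n) to control its conditioning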
[U, S, V]=svd(A);
S = diag(logspace(-1, 1, n));
A = U(:, 1:n)*S*V';
B = randn(m, n);
B = B/norm(B);
b = randn(m, 1);
%% cvx: nominal optimal solution
fprintf(1, '(1). the nominal problem ... \n');
cvx_begin quiet
variable x_nom(n)
minimize (norm(A*x_nom-b))
cvx_end
%% cvx: stochastic robust approximation
fprintf(1, '(2). the stochastic robust approximation problem ... \n');
P = (1/3)*(B'*B);
cvx_begin quiet
variable x_stoch(n)
minimize (square_pos(norm(A*x_stoch - b)) + quad_form(x_stoch, P))
cvx_end
%% cvx: worst-case robust approximation
fprintf(1, '(3). the worst case robust approximation problem ... \n');
cvx_begin quiet
variable x_wc(n)
minimize (max(norm((A-B)*x_wc-b), norm((A+B)*x_wc-b)))
cvx_end
%% plot residuals
nobs = 100;
parvals = linspace(-2, 2, nobs);
eps_nom = zeros(1, nobs);
eps_stoch = zeros(1, nobs);
eps_wc = zeros(1, nobs);
for k=1:nobs
eps_nom(k) = norm((A+parvals(k)*B)*x_nom-b);
eps_stoch(k) = norm((A+parvals(k)*B)*x_stoch-b);
eps_wc(k) = norm((A+parvals(k)*B)*x_wc-b);
end
figure; hold on;
plot(parvals, eps_nom, 'b-');
plot(parvals, eps_stoch, 'r-');
plot(parvals, eps_wc, 'g-');
legend('nominal', 'stochastic robust', 'worst-case robust', 'Location', 'best');
xlabel('$u$', 'Interpreter', 'latex');
ylabel('$r(u)=\Vert A(u)x-b\Vert_2$', 'Interpreter', 'latex');
title('Approximate solutions');
hold off;
Finite Set
Here we have $\mathcal{A}=\{A_1, \dots, A_k\}$, and the robust approximation problem is
$$
\min\ \max_{i=1, \dots, k} \lVert A_ix-b\rVert
$$
This problem is equivalent to the robust approximation problem with the polyhedral set $\mathcal{A}=\operatorname{conv}\{A_1, \dots, A_k\}$:
$$
\min\ \sup\{\lVert Ax-b\rVert\mid A\in \operatorname{conv}\{A_1, A_2, \dots, A_k\}\}
$$
We can cast the problem in epigraph form as
$$
\min\quad t \qquad \text{s.t.}\quad \lVert A_ix-b\rVert\leq t, \quad i=1,\dots, k
$$
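A minimal CVX sketch of this epigraph form (the data below are hypothetical placeholders):

% worst-case robust approximation over a finite set {A_1, ..., A_k}
m = 30; n = 10; k = 3;
Ai = {randn(m, n), randn(m, n), randn(m, n)};   % hypothetical A_i
b = randn(m, 1);
cvx_begin quiet
    variable x(n)
    variable t
    minimize (t)
    subject to
        for i = 1:k
            norm(Ai{i}*x - b) <= t;   % ||A_i*x - b|| <= t for each i
        end
cvx_end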
Norm bound error
Here the uncertainty set $\mathcal{A}$ is a norm ball:

$$
\mathcal{A}=\{\bar{A}+U\mid\lVert U\rVert\leq a\}
$$
In this case we have

$$
e_{wc}(x)=\sup\{\lVert\bar{A}x-b+Ux\rVert\mid\lVert U\rVert\leq a\}
$$
Taking the norm to be the Euclidean norm, with $\lVert U\rVert$ the corresponding induced (spectral) norm, the supremum in the expression for $e_{wc}(x)$ is attained for $U=auv^T$, with

$$
\begin{cases}
u = \dfrac{\bar{A}x-b}{\lVert\bar{A}x-b\rVert_2}\\[1.5ex]
v = \dfrac{x}{\lVert x\rVert_2}
\end{cases}
$$
and the resulting worst-case error is

$$
e_{wc}(x)=\lVert\bar{A}x-b\rVert_2+a\lVert x\rVert_2
$$

Minimizing this is a regularized norm problem, which can be solved as the SOCP
$$
\min\ t_1+a t_2 \qquad \text{s.t.}\ \begin{cases} \lVert \bar{A}x-b\rVert_2\leq t_1\\ \lVert x\rVert_2\leq t_2 \end{cases}
$$
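In CVX the regularized form can be passed directly, since CVX carries out the SOCP transformation internally; a minimal sketch with hypothetical Abar, b, and bound a:

% worst-case robust approximation with norm bound error
m = 30; n = 10;
Abar = randn(m, n); b = randn(m, 1);   % hypothetical data
a = 0.1;                               % hypothetical bound on ||U||
cvx_begin quiet
    variable x(n)
    minimize (norm(Abar*x - b) + a*norm(x))   % ||Abar*x - b||_2 + a*||x||_2
cvx_end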
Uncertainty ellipsoids
We can also describe the variation in $A$ by giving an ellipsoid of possible values for each row:

$$
\mathcal{A}=\{[a_1,\dots, a_m]^T\mid a_i\in\varepsilon_i,\ i=1, \dots, m\}
$$
where

$$
\varepsilon_i=\{\bar{a}_i+P_iu\mid \lVert u\rVert_2\leq 1\}
$$
We allow $P_i$ to have a nontrivial nullspace, in order to model the situation when the variation in $a_i$ is restricted to a subspace.
There is an explicit expression for the worst-case magnitude of each residual:
$$
\begin{aligned}
\sup_{a_i\in\varepsilon_i}|a_i^Tx-b_i|&=\sup\{|\bar{a}_i^Tx-b_i+(P_iu)^Tx| \mid \lVert u\rVert_2\leq 1\}\\
&=|\bar{a}_i^Tx-b_i|+\lVert P_i^Tx\rVert_2
\end{aligned}
$$
By introducing new variables $t_1, \dots, t_m$, the worst-case robust approximation problem can be reformulated as
$$
\min\quad \lVert t\rVert \qquad \text{s.t.}\ \begin{cases} \bar{a}_i^Tx-b_i+\lVert P_i^Tx\rVert_2\leq t_i\\ -\bar{a}_i^Tx+b_i+\lVert P_i^Tx\rVert_2\leq t_i \end{cases} \quad i=1,\dots,m
$$
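A minimal CVX sketch of this reformulation with the Euclidean norm on $t$ (the row means in Abar and the matrices P_i are hypothetical placeholders):

% worst-case robust approximation with row-wise ellipsoidal uncertainty
m = 30; n = 10;
Abar = randn(m, n); b = randn(m, 1);   % row i of Abar is abar_i'
Pi = arrayfun(@(i) 0.1*randn(n, n), 1:m, 'UniformOutput', false);   % hypothetical P_i
cvx_begin quiet
    variables x(n) t(m)
    minimize (norm(t))
    subject to
        for i = 1:m
            abs(Abar(i,:)*x - b(i)) + norm(Pi{i}'*x) <= t(i);   % |abar_i'*x - b_i| + ||P_i'*x|| <= t_i
        end
cvx_end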
Reference
S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004, p. 333.