【CVX】Robust approximation

Probability distribution

We can impose the constraint that $x$ satisfy $x\succeq 0$, $\mathbf{1}^Tx=1$, i.e., that $x$ represent a probability distribution:

$$\min\quad \lVert Ax-b\rVert\\ \text{s.t.}\quad x\succeq 0,\quad \mathbf{1}^Tx=1$$
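A minimal CVX sketch of this problem, assuming the data A, b, and the dimension n are already defined and measuring the error in the Euclidean norm:

cvx_begin
    variable x(n)
    minimize (norm(A*x - b))
    subject to
        x >= 0;        % elementwise nonnegativity
        sum(x) == 1;   % entries sum to one
cvx_end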

Norm ball constraint

We can add to the basic norm approximation problem the constraint that $x$ lie in a norm ball:

$$\min\quad \lVert Ax-b\rVert\\ \text{s.t.}\quad \lVert x-x_0\rVert\leq d$$

where $x_0$ is a prior guess of the parameter $x$, and $d$ is the maximum plausible deviation of our estimate from the prior guess. The constraint $\lVert x-x_0\rVert\leq d$ defines a trust region.
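A CVX sketch of this trust-region variant, assuming A, b, a prior guess x0, and a radius d are given:

cvx_begin
    variable x(n)
    minimize (norm(A*x - b))
    subject to
        norm(x - x0) <= d;   % trust-region constraint
cvx_end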

Stochastic robust approximation

We assume that $A$ is a random variable taking values in $\mathbb{R}^{m\times n}$, with mean $\bar{A}$, so we can describe $A$ as

$$A=\bar{A}+U$$

where $U$ is a random matrix with zero mean. Here, the constant matrix $\bar{A}$ gives the average value of $A$, and $U$ describes its statistical variation.

It is natural to use the expected value of $\lVert Ax-b\rVert$ as the objective:

$$\min\quad \mathbb{E}\lVert Ax-b\rVert \tag{1}$$

We refer to this problem as the stochastic robust approximation problem. It is always a convex optimization problem, but usually not tractable, since in most cases it is very difficult to evaluate the objective or its derivatives.
When $A$ assumes only a finite number of values, i.e.,

$$\mathbf{prob}(A=A_i)=p_i, \quad i=1, \dots, k$$

where $A_i\in\mathbb{R}^{m\times n}$, $\mathbf{1}^Tp=1$, $p\succeq 0$, problem (1) takes the form

$$\min\quad p_1\lVert A_1x-b\rVert+\dots+p_k\lVert A_kx-b\rVert$$

which is often called a sum-of-norms problem. It can be expressed as

$$\min\quad p^Tt\\ \text{s.t.}\quad \lVert A_ix-b\rVert\leq t_i, \quad i=1, \dots, k$$

with variables $x\in\mathbb{R}^n$ and $t\in\mathbb{R}^k$. If the norm is the Euclidean norm, this sum-of-norms problem is an SOCP. If the norm is the $\ell_1$ or $\ell_\infty$ norm, it can be expressed as an LP.
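A CVX sketch of the sum-of-norms problem, assuming the scenario matrices are stored in a cell array A{1}, ..., A{k} and p is the probability vector (illustrative names):

cvx_begin
    variable x(n)
    expression r(k)
    for i = 1:k
        r(i) = norm(A{i}*x - b);   % residual norm under scenario i
    end
    minimize (p'*r)                % probability-weighted sum of norms
cvx_end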

Example: Consider the statistical robust least-squares problem

$$\min\quad \mathbb{E}\lVert Ax-b\rVert_2^2$$

where the norm is the Euclidean norm. We can express the objective as

$$\begin{aligned} \mathbb{E}\lVert Ax-b\rVert_2^2&=\mathbb{E}(\bar{A}x-b+Ux)^T(\bar{A}x-b+Ux)\\ &=(\bar{A}x-b)^T(\bar{A}x-b)+\mathbb{E}\,x^TU^TUx\\ &=\lVert\bar{A}x-b\rVert_2^2+x^TPx \end{aligned}$$

where $P=\mathbb{E}U^TU$; the cross term vanishes because $\mathbb{E}U=0$. Therefore the statistical robust approximation problem has the form of a regularized least-squares problem

$$\min\quad \lVert\bar{A}x-b\rVert_2^2+\lVert P^{1/2}x\rVert_2^2$$

with solution

$$x=(\bar{A}^T\bar{A}+P)^{-1}\bar{A}^Tb$$
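In MATLAB the closed-form solution is a single line (a sketch, assuming Abar, b, and P are given):

% regularized least-squares solution via the normal equations
x = (Abar'*Abar + P) \ (Abar'*b);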

Worst-case robust approximation

It is also possible to model the variation in the matrix $A$ using a set-based, worst-case approach. We describe the uncertainty by a set of possible values for $A$:

$$A\in\mathcal{A}\subseteq\mathbb{R}^{m\times n}$$

which is assumed to be nonempty and bounded. The associated worst-case error of a candidate approximate solution $x\in\mathbb{R}^n$ is

$$e_{wc}(x)=\sup\{\lVert Ax-b\rVert\mid A\in\mathcal{A}\}$$

which is always a convex function of $x$. The worst-case robust approximation problem is to minimize $e_{wc}(x)$ over $x$. Its tractability depends on the norm used and the description of the uncertainty set $\mathcal{A}$.

Comparison of stochastic and worst-case robust approximation

Consider the least-squares problem

$$\min\quad \lVert A(u)x-b\rVert_2^2$$

where $u\in[-1,1]$ is an uncertain parameter and $A(u)=A+uB$. We compare three approximate solutions:

  • nominal optimal: take $u=0$ and minimize $\lVert Ax-b\rVert_2$
  • stochastic robust approximation, assuming $u$ is uniformly distributed on $[-1,1]$:
    $$\min\quad \mathbb{E}\lVert (A+uB)x-b\rVert_2^2=\lVert Ax-b\rVert_2^2+x^TPx$$
    where $P=\mathbb{E}[u^2]\,B^TB=(1/3)B^TB$
  • worst-case robust approximation: minimize
    $$\sup_{-1\leq u\leq 1}\lVert (A+uB)x-b\rVert_2=\max\{\lVert(A-B)x-b\rVert_2,\ \lVert(A+B)x-b\rVert_2\}$$

In an optimal design setting, the variation in $A$ can represent uncertainty in the linear equations that relate the design variables $x$ to the results $Ax$.

CVX code

%% comparison of stochastic and worst-case robust
rng(0);
m = 20;
n = 10;
A = randn(m, n);
% reshape the spectrum of A: singular values logarithmically spaced in [0.1, 10]
[U, S, V] = svd(A);
S = diag(logspace(-1, 1, n));
A = U(:, 1:n)*S*V';

B = randn(m, n);
B = B/norm(B);
b = randn(m, 1);

%% cvx: nominal optimal solution
fprintf(1, '(1). the nominal problem ... \n');

cvx_begin quiet
    variable x_nom(n)
    minimize (norm(A*x_nom-b))
cvx_end

%% cvx: stochastic robust approximation
fprintf(1, '(2). the stochastic robust approximation problem ... \n');

P = (1/3)*(B'*B);
cvx_begin quiet
    variable x_stoch(n)
    minimize (square_pos(norm(A*x_stoch - b)) + quad_form(x_stoch, P))
cvx_end

%% cvx: worst-case robust approximation
fprintf(1, '(3). the worst case robust approximation problem ... \n');

cvx_begin quiet
    variable x_wc(n)
    minimize (max(norm((A-B)*x_wc-b), norm((A+B)*x_wc-b)))
cvx_end

%% plot residuals
nobs = 100;
parvals = linspace(-2, 2, nobs);

eps_nom = zeros(1, nobs);
eps_stoch = zeros(1, nobs);
eps_wc = zeros(1, nobs);

for k=1:nobs
    eps_nom(k) = norm((A+parvals(k)*B)*x_nom-b);
    eps_stoch(k) = norm((A+parvals(k)*B)*x_stoch-b);
    eps_wc(k) = norm((A+parvals(k)*B)*x_wc-b);
end

figure; hold on;
plot(parvals, eps_nom, 'b-');
plot(parvals, eps_stoch, 'r-');
plot(parvals, eps_wc, 'g-');
xlabel('$u$', 'Interpreter', 'latex');
ylabel('$r(u)=\Vert A(u)x-b\Vert_2$', 'Interpreter', 'latex');
title('Approximate solutions');
legend('nominal', 'stochastic robust', 'worst-case robust');
hold off;

Finite set

Here we have $\mathcal{A}=\{A_1, \dots, A_k\}$, and the robust approximation problem is

$$\min\quad \max_{i=1, \dots, k} \lVert A_ix-b\rVert$$

This problem is equivalent to the robust approximation problem with the polyhedral uncertainty set $\mathcal{A}=\mathbf{conv}\{A_1, \dots, A_k\}$:

$$\min\quad \sup\{\lVert Ax-b\rVert\mid A\in \mathbf{conv}\{A_1, A_2, \dots, A_k\}\}$$

We can cast the problem in epigraph form as

$$\min\quad t\\ \text{s.t.}\quad \lVert A_ix-b\rVert\leq t, \quad i=1,\dots, k$$
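A CVX sketch of this epigraph form, assuming the k matrices are stored in a cell array A{1}, ..., A{k} (illustrative names):

cvx_begin
    variables x(n) t
    minimize (t)
    subject to
        for i = 1:k
            norm(A{i}*x - b) <= t;   % epigraph constraint for each A_i
        end
cvx_end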

Norm bound error

Here the uncertainty set $\mathcal{A}$ is a norm ball:

$$\mathcal{A}=\{\bar{A}+U\mid\lVert U\rVert\leq a\}$$

where $\lVert U\rVert$ is the spectral norm and the approximation error is measured in the Euclidean norm. In this case we have

$$e_{wc}(x)=\sup\{\lVert\bar{A}x-b+Ux\rVert_2\mid\lVert U\rVert\leq a\}$$

The supremum in the expression for $e_{wc}(x)$ is attained for $U=auv^T$, with

$$u = \frac{\bar{A}x-b}{\lVert\bar{A}x-b\rVert_2},\qquad v = \frac{x}{\lVert x\rVert_2}$$

and the resulting worst-case error is

$$e_{wc}(x)=\lVert\bar{A}x-b\rVert_2+a\lVert x\rVert_2$$

This is a regularized norm problem; it can be solved as the SOCP

$$\min\quad t_1+a t_2\\ \text{s.t.}\quad \lVert \bar{A}x-b\rVert_2\leq t_1,\quad \lVert x\rVert_2\leq t_2$$
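In CVX the regularized form can be minimized directly, and the solver handles the SOCP reformulation; a sketch, assuming Abar, b, and the bound a are given:

cvx_begin
    variable x(n)
    minimize (norm(Abar*x - b) + a*norm(x))   % ||Abar*x - b||_2 + a*||x||_2
cvx_end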

Uncertainty ellipsoids

We can also describe the variation in $A$ by giving an ellipsoid of possible values for each row:

$$\mathcal{A}=\{[a_1,\dots, a_m]^T\mid a_i\in\varepsilon_i,\ i=1,2, \dots, m\}$$

where

$$\varepsilon_i=\{\bar{a}_i+P_iu\mid \lVert u\rVert_2\leq 1\}$$

We allow $P_i$ to have a nontrivial nullspace, in order to model the situation where the variation in $a_i$ is restricted to a subspace.

There is an explicit expression for the worst-case magnitude of each residual:

$$\begin{aligned} \sup_{a_i\in\varepsilon_i}|a_i^Tx-b_i|&=\sup\{|\bar{a}_i^Tx-b_i+(P_iu)^Tx|\mid\lVert u\rVert_2\leq 1\}\\ &=|\bar{a}_i^Tx-b_i|+\lVert P_i^Tx\rVert_2 \end{aligned}$$

The worst-case error $e_{wc}(x)$ is then the norm of the vector of these worst-case residuals. By introducing new variables $t_1, \dots, t_m$, the robust approximation problem can be reformulated as

$$\min\quad \lVert t\rVert\\ \text{s.t.}\quad \bar{a}_i^Tx-b_i+\lVert P_i^Tx\rVert_2\leq t_i,\quad -\bar{a}_i^Tx+b_i+\lVert P_i^Tx\rVert_2\leq t_i,\quad i=1,\dots,m$$
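A CVX sketch for the Euclidean-norm case, assuming the row means are the rows of Abar and the ellipsoid matrices are stored in a cell array P{1}, ..., P{m} (illustrative names):

cvx_begin
    variables x(n) t(m)
    minimize (norm(t))
    subject to
        for i = 1:m
            % worst-case residual of row i must not exceed t(i)
            abs(Abar(i,:)*x - b(i)) + norm(P{i}'*x) <= t(i);
        end
cvx_end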

Reference

S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004, p. 333.
