UA SIE545 优化理论基础2 凸函数 概念 理论 总结

UA SIE545 优化理论基础2 凸函数 概念 理论 总结


凸函数的概念与简单性质

Convex function f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. Call f f f a convex function on S S S if ∀ x 1 , x 2 ∈ S \forall x_1,x_2 \in S ∀x1​,x2​∈S, λ ∈ ( 0 , 1 ) \lambda \in (0,1) λ∈(0,1)
f ( λ x 1 + ( 1 − λ ) x 2 ) ≤ λ f ( x 1 ) + ( 1 − λ ) f ( x 2 ) f(\lambda x_1+(1-\lambda)x_2) \le \lambda f(x_1)+(1-\lambda)f(x_2) f(λx1​+(1−λ)x2​)≤λf(x1​)+(1−λ)f(x2​)

Call f f f a strictly convex function on S S S if ∀ x 1 , x 2 ∈ S \forall x_1,x_2 \in S ∀x1​,x2​∈S, λ ∈ ( 0 , 1 ) \lambda \in (0,1) λ∈(0,1)
f ( λ x 1 + ( 1 − λ ) x 2 ) < λ f ( x 1 ) + ( 1 − λ ) f ( x 2 ) f(\lambda x_1+(1-\lambda)x_2) < \lambda f(x_1)+(1-\lambda)f(x_2) f(λx1​+(1−λ)x2​)<λf(x1​)+(1−λ)f(x2​)

Call f f f a (strictly) concave function on S S S if − f -f −f is (strictly) convex.

Level set

  1. Lower-level set: S α = { x ∈ S : f ( x ) ≤ α } S_{\alpha} = \{x \in S:f(x) \le \alpha\} Sα​={x∈S:f(x)≤α}
  2. Upper-level set: { x ∈ S : f ( x ) ≥ α } \{x \in S:f(x) \ge \alpha\} {x∈S:f(x)≥α}

Properties

  1. If f f f is a convex function, S α S_{\alpha} Sα​ is a convex set.
  2. f ∈ C ( i n t S ) f \in C(int S) f∈C(intS) ( f f f is continuous on the interior of S S S)
  3. For R n \mathbb{R}^n Rn convex function f f f, any non-zero directional derivative exists. ∃ lim ⁡ λ → 0 + f ( x ˉ + λ d ) − f ( x ˉ ) λ , x ˉ ∈ S , x ˉ + λ d ∈ S \exists \lim_{\lambda \to 0^+}\frac{f(\bar x+\lambda d)-f(\bar x)}{\lambda},\bar x \in S,\bar x + \lambda d \in S ∃λ→0+lim​λf(xˉ+λd)−f(xˉ)​,xˉ∈S,xˉ+λd∈S

次梯度(sub-gradient)

epigraph f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set.
e p i   f = { ( x , y ) : x ∈ S , y ∈ R , y ≥ f ( x ) } epi \ f = \{(x,y):x \in S, y \in \mathbb{R}, y \ge f(x)\} epi f={(x,y):x∈S,y∈R,y≥f(x)}

hypograph f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set.
h y p   f = { ( x , y ) : x ∈ S , y ∈ R , y ≤ f ( x ) } hyp \ f = \{(x,y):x \in S, y \in \mathbb{R}, y \le f(x)\} hyp f={(x,y):x∈S,y∈R,y≤f(x)}

Property f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. f f f is convex iff e p i   f epi \ f epi f is convex.

subgradient f : S → R f:S \to \mathbb{R} f:S→R is convex where S S S is a nonempty convex set. Then ξ \xi ξ is called subgradient of f f f at x ˉ \bar x xˉ if
f ( x ) ≥ f ( x ˉ ) + ξ T ( x − x ˉ ) , ∀ x ∈ S f(x) \ge f(\bar x)+\xi^T(x-\bar x),\forall x \in S f(x)≥f(xˉ)+ξT(x−xˉ),∀x∈S

f : S → R f:S \to \mathbb{R} f:S→R is concave where S S S is a nonempty convex set. Then ξ \xi ξ is called subgradient of f f f at x ˉ \bar x xˉ if
f ( x ) ≤ f ( x ˉ ) + ξ T ( x − x ˉ ) , ∀ x ∈ S f(x) \le f(\bar x)+\xi^T(x-\bar x),\forall x \in S f(x)≤f(xˉ)+ξT(x−xˉ),∀x∈S

Property

  1. f : S → R f:S \to \mathbb{R} f:S→R is convex where S S S is a nonempty convex set. ∀ x ˉ ∈ i n t S \forall \bar x \in intS ∀xˉ∈intS, ∃ ξ \exists \xi ∃ξ such that
    f ( x ) ≥ f ( x ˉ ) + ξ T ( x − x ˉ ) , ∀ x ∈ S f(x) \ge f(\bar x)+\xi^T(x-\bar x),\forall x \in S f(x)≥f(xˉ)+ξT(x−xˉ),∀x∈Sand the hyperplane
    H = { ( x , y ) : y = f ( x ˉ ) + ξ T ( x − x ˉ ) } H = \{(x,y):y = f(\bar x)+\xi^T(x-\bar x)\} H={(x,y):y=f(xˉ)+ξT(x−xˉ)}supports e p i   f epi\ f epi f at ( x ˉ , f ( x ˉ ) ) (\bar x,f(\bar x)) (xˉ,f(xˉ)).
  2. f : S → R f:S \to \mathbb{R} f:S→R is strictly convex where S S S is a nonempty convex set. ∀ x ˉ ∈ i n t S \forall \bar x \in intS ∀xˉ∈intS, ∃ ξ \exists \xi ∃ξ such that
    f ( x ) > f ( x ˉ ) + ξ T ( x − x ˉ ) , ∀ x ∈ S f(x)> f(\bar x)+\xi^T(x-\bar x),\forall x \in S f(x)>f(xˉ)+ξT(x−xˉ),∀x∈S
  3. f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. If ∀ x ˉ ∈ i n t S \forall \bar x \in intS ∀xˉ∈intS, ∃ ξ \exists \xi ∃ξ such that
    f ( x ) ≥ f ( x ˉ ) + ξ T ( x − x ˉ ) , ∀ x ∈ S f(x) \ge f(\bar x)+\xi^T(x-\bar x),\forall x \in S f(x)≥f(xˉ)+ξT(x−xˉ),∀x∈Sthen f f f is convex.

可微的凸函数

differentiable f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty set. Say f f f is differentiable at x ˉ ∈ i n t S \bar x \in int S xˉ∈intS if ∃ β \exists \beta ∃β such that
f ( x ) = f ( x ˉ ) + β T ( x − x ˉ ) + o ( ∥ x − x ˉ ∥ ) β = ∇ f ( x ˉ ) f(x)=f(\bar x)+\beta^T(x-\bar x)+o(\left\| x-\bar x \right\|) \\ \beta = \nabla f(\bar x) f(x)=f(xˉ)+βT(x−xˉ)+o(∥x−xˉ∥)β=∇f(xˉ)

Property

  1. f : S → R f:S \to \mathbb{R} f:S→R is convex where S S S is a nonempty convex set. If f f f is differentiable at x ˉ \bar x xˉ, then ∇ f ( x ˉ ) \nabla f(\bar x) ∇f(xˉ) is subgradient.
  2. f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. f f f is differentiable at x ˉ \bar x xˉ, then f f f is convex iff ∀ x ˉ ∈ S \forall \bar x \in S ∀xˉ∈S, f ( x ) ≥ f ( x ˉ ) + ∇ f ( x ˉ ) T ( x − x ˉ ) , ∀ x ∈ S f(x) \ge f(\bar x)+\nabla f(\bar x)^T(x-\bar x),\forall x \in S f(x)≥f(xˉ)+∇f(xˉ)T(x−xˉ),∀x∈S
  3. f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. f f f is differentiable on S S S, then f f f is convex iff ∀ x 1 , x 2 ∈ S \forall x_1, x_2 \in S ∀x1​,x2​∈S, [ ∇ f ( x 2 ) − ∇ f ( x 1 ) ] T ( x 2 − x 1 ) ≥ 0 [\nabla f(x_2)-\nabla f(x_1)]^T(x_2-x_1) \ge 0 [∇f(x2​)−∇f(x1​)]T(x2​−x1​)≥0

PSD positive semi-definite, ∀ x ∈ R n \forall x \in \mathbb{R}^n ∀x∈Rn, x T H x ≥ 0 x^THx \ge 0 xTHx≥0
PD positive definite, ∀ x ∈ R n \forall x \in \mathbb{R}^n ∀x∈Rn, x T H x > 0 x^THx > 0 xTHx>0

Tips for checking PSD/PD

  1. ∀ i , H i i < 0 \forall i,H_{ii}<0 ∀i,Hii​<0, H H H is not PSD; ∀ i , H i i ≤ 0 \forall i,H_{ii} \le 0 ∀i,Hii​≤0, H H H is not PD;
  2. main sub-matrix is PSD/PD then H H H is PSD/PD
  3. if H T = H H^T=H HT=H, PD = PSD+nonsingular
  4. if H H H is 2 × 2 2 \times 2 2×2, H 11 > 0 , H 22 > 0 , ∣ H ∣ > 0 H_{11}>0,H_{22}>0,|H|>0 H11​>0,H22​>0,∣H∣>0 means H H H is PD

Property

  1. f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. f f f is twice differentiable on S S S, then f f f is convex iff H f ( x ˉ ) Hf(\bar x) Hf(xˉ) is PSD.
  2. f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. f f f is twice differentiable on S S S, then if H f ( x ˉ ) Hf(\bar x) Hf(xˉ) is PD, f f f is strictly convex; if f f f is strictly convex, H f ( x ˉ ) Hf(\bar x) Hf(xˉ) is PSD; if f f f is strictly convex and quadratic, H f ( x ˉ ) Hf(\bar x) Hf(xˉ) is PD
  3. f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. f f f is infinitely differentiable on S S S, then f f f is strictly convex at x ˉ \bar x xˉ iff f ( n ) ( x ˉ ) > 0 f^{(n)}(\bar x)>0 f(n)(xˉ)>0 and f ( j ) ( x ) = 0 , ∀ 1 < j < n f^{(j)}(x) = 0,\forall 1< j<n f(j)(x)=0,∀1<j<n

凸函数的推广

quasiconvex f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. ∀ x 1 , x 2 ∈ S \forall x_1,x_2 \in S ∀x1​,x2​∈S, f ( λ x 1 + ( 1 − λ ) x 2 ) ≤ max ⁡ ( f ( x 1 ) , f ( x 2 ) ) f(\lambda x_1+(1-\lambda)x_2) \le \max(f(x_1),f(x_2)) f(λx1​+(1−λ)x2​)≤max(f(x1​),f(x2​))

Property

  1. f f f is quasiconvex iff ∀ α \forall \alpha ∀α, S α S_{\alpha} Sα​ is convex
  2. S S S is nonempty compact polyhedral then ∃ x ˉ \exists \bar x ∃xˉ that is both an extreme point of S S S and also the optimal solution to min ⁡ S f ( x ) \min_S f(x) minS​f(x)
  3. S S S is open convex, f f f is differentiable. f f f is quasiconvex iff one of the following is correct: if x 1 , x 2 ∈ S x_1,x_2 \in S x1​,x2​∈S, f ( x 1 ) ≤ f ( x 2 ) f(x_1) \le f(x_2) f(x1​)≤f(x2​), then ∇ f ( x 2 ) T ( x 1 − x 2 ) ≤ 0 \nabla f(x_2)^T(x_1-x_2) \le 0 ∇f(x2​)T(x1​−x2​)≤0 or if x 1 , x 2 ∈ S x_1,x_2 \in S x1​,x2​∈S, ∇ f ( x 2 ) T ( x 1 − x 2 ) ≤ 0 \nabla f(x_2)^T(x_1-x_2) \le 0 ∇f(x2​)T(x1​−x2​)≤0, then f ( x 1 ) ≤ f ( x 2 ) f(x_1) \le f(x_2) f(x1​)≤f(x2​)

strict quasiconvex f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. ∀ x 1 , x 2 ∈ S \forall x_1,x_2 \in S ∀x1​,x2​∈S, f ( x 1 ) ≠ f ( x 2 ) f(x_1) \ne f(x_2) f(x1​)​=f(x2​), f ( λ x 1 + ( 1 − λ ) x 2 ) < max ⁡ ( f ( x 1 ) , f ( x 2 ) ) f(\lambda x_1+(1-\lambda)x_2) < \max(f(x_1),f(x_2)) f(λx1​+(1−λ)x2​)<max(f(x1​),f(x2​))

Property

  1. If x ˉ \bar x xˉ is local minimum to min ⁡ S f ( x ) \min_S f(x) minS​f(x) where S S S is convex, then x ˉ \bar x xˉ is global minimum.

strong quasiconvex
f : S → R f:S \to \mathbb{R} f:S→R where S S S is a nonempty convex set. ∀ x 1 , x 2 ∈ S \forall x_1,x_2 \in S ∀x1​,x2​∈S, x 1 ≠ x 2 x_1 \ne x_2 x1​​=x2​, f ( λ x 1 + ( 1 − λ ) x 2 ) < max ⁡ ( f ( x 1 ) , f ( x 2 ) ) f(\lambda x_1+(1-\lambda)x_2) < \max(f(x_1),f(x_2)) f(λx1​+(1−λ)x2​)<max(f(x1​),f(x2​))

Property

  1. strong quasiconvex leads to strict quasiconvex
  2. If x ˉ \bar x xˉ is local minimum to min ⁡ S f ( x ) \min_S f(x) minS​f(x) where S S S is convex, then x ˉ \bar x xˉ is unique global minimum.

Pseudoconvex ∀ x 1 , x 2 ∈ X \forall x_1,x_2 \in X ∀x1​,x2​∈X such that ∇ f ( x 1 ) T ( x 2 − x 1 ) ≥ 0 \nabla f(x_1)^T(x_2-x_1) \ge 0 ∇f(x1​)T(x2​−x1​)≥0, we have f ( x 2 ) ≥ f ( x 1 ) f(x_2) \ge f(x_1) f(x2​)≥f(x1​).

Property

  1. Pseudoconvex + differentiable = strict quasiconvex
  2. Strict Pseudoconvex + differentiable = strong quasiconvex
上一篇:ZT C/C++变量命名规则,个人习惯总结


下一篇:Flink(scala)整合MySQL统计UV(unique visitor)