Solving Variational Inequalities

Published: January 7, 2024

Introduction

Let $\theta: R^n \to (-\infty, \infty]$ be a closed proper convex function and let $F: R^n \to R^n$ be a continuous vector-valued mapping. The generalized variational inequality (GVI) [9,19] takes the form: find $x^* \in R^n$ such that
$$\theta(x) - \theta(x^*) + (x - x^*)^T F(x^*) \ge 0, \quad \forall x \in R^n,$$
where the arithmetic convention $\infty - \infty = \infty$ is retained. In this article we focus on the case where $F$ is monotone, i.e. $(F(x) - F(y))^T(x - y) \ge 0$ for all $x, y \in R^n$, and we make the following additional assumptions on $MGVI(\theta, F)$:
(a) the monotone operator $F$ is Lipschitz continuous, with Lipschitz constant $L > 0$;
(b) the solution set of $MGVI(\theta, F)$, denoted $\Omega^*$, is nonempty.

Various convex optimization problems can be formulated as MGVIs, for example the least absolute shrinkage and selection operator (LASSO, $\ell_1$-regularized least squares) problem, the basis pursuit problem [7], the basis pursuit denoising problem [5], the Dantzig selector [6], and so on. Moreover, the monotone generalized variational inequality (MGVI) contains the classical monotone variational inequality (MVI): setting $\theta = \delta_C$ (the indicator function of $C$, equal to $0$ if $x \in C$ and $\infty$ otherwise), where $C \subseteq R^n$ is a nonempty closed convex set, reduces the MGVI to the MVI
$$x^* \in C, \quad (x - x^*)^T F(x^*) \ge 0, \quad \forall x \in C.$$
Let $f: R^n \to (-\infty, \infty]$ be a closed proper convex function. The proximity operator of $f$ is defined as
$$Prox_f: x \mapsto \mathop{\arg\min}\limits_{y \in R^n}\{f(y) + \frac{1}{2}\|x - y\|^2\}.$$

Let $f: R^n \to (-\infty, \infty]$ be a closed proper convex function. Then the following three conditions are equivalent:
1. $p = Prox_f(x)$.
2. $x - p \in \partial f(p)$.
3. $(x - p)^T(p - w) \ge f(p) - f(w)$ for all $w \in R^n$.
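
As a concrete instance of condition 1, the proximity operator of $f(y) = \lambda\|y\|_1$ is the soft-thresholding operator used throughout this article. A minimal MATLAB sketch (scalar case, brute-force grid search as a numerical check; all names are illustrative):

```matlab
% Check Prox_{lambda*|.|}(x) = soft-thresholding, in 1-D.
lambda = 0.5; x = 1.3;
soft = sign(x) * max(abs(x) - lambda, 0);        % closed-form prox
ys = linspace(-3, 3, 600001);                    % brute-force search grid
[~, i] = min(lambda*abs(ys) + 0.5*(x - ys).^2);  % objective of the prox problem
fprintf('closed form: %.4f, grid argmin: %.4f\n', soft, ys(i));
```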

The $\ell_1$-regularized least squares problem

Consider the linear inverse problem
$$b = Ax + w,$$
where $b \in R^m$ and $A \in R^{m \times n}$ are known and $w$ is noise; the goal is to solve for $x$.

To keep the residual (noise) small, least squares solves
$$\widetilde{x} = \mathop{\arg\min}\limits_{x}\|Ax - b\|^2.$$
If $m = n$ and $A$ is nonsingular, then $x = A^{-1}b$. In many cases, however, $A$ is ill-conditioned, and a tiny perturbation of the system changes the least-squares solution drastically. To handle such ill-conditioned linear inverse problems one can use $\ell_2$-regularized least squares (also called Tikhonov regularization or ridge regression):
$$\widetilde{x} = \mathop{\arg\min}\limits_{x \in R^n} f(x) := \frac{1}{2}\|Ax - b\|^2 + \lambda\|x\|_2^2.$$
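
Unlike the $\ell_1$ problem below, the $\ell_2$-regularized problem has a closed-form solution: setting the gradient $A^T(Ax - b) + 2\lambda x$ to zero gives $x = (A^TA + 2\lambda I)^{-1}A^Tb$. A small MATLAB sketch (random illustrative data):

```matlab
% Ridge regression via the normal equations (lambda > 0 makes the system invertible).
rng(0); m = 20; n = 50;
A = randn(m, n); b = randn(m, 1); lambda = 0.1;
x_ridge = (A'*A + 2*lambda*eye(n)) \ (A'*b);  % minimizes 0.5||Ax-b||^2 + lambda||x||_2^2
```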

The $\ell_1$-regularized least squares (LASSO, least absolute shrinkage and selection operator) problem:
$$\mathop{min}\limits_{x \in R^n} f(x) := \frac{1}{2}\|Ax - b\|^2 + \lambda\|x\|_1.$$
For the unconstrained problem $\min f(x), x \in R^n$:
Gradient descent: $x_k = x_{k-1} - t_{k-1}\nabla f(x_{k-1})$.
Gradient descent applies directly to differentiable functions; for nondifferentiable functions, the notions of subgradient and subdifferential are introduced.
Projection operator ($P_C$, projection): $P_C = Prox_{t_k\delta_C}$,
$$x_k = P_C(x_{k-1} - t_{k-1}\nabla f(x_{k-1})).$$
Proximal linearization:
$$x_k = \mathop{\arg\min}\limits_{x}\{f(x_{k-1}) + \langle x - x_{k-1}, \nabla f(x_{k-1})\rangle + \frac{1}{2t_{k-1}}\|x - x_{k-1}\|^2\}.$$
Since
$$f(x_{k-1}) + \langle x - x_{k-1}, \nabla f(x_{k-1})\rangle + \frac{1}{2t_{k-1}}\|x - x_{k-1}\|^2 = \frac{1}{2t_{k-1}}\|x - (x_{k-1} - t_{k-1}\nabla f(x_{k-1}))\|^2 + D,$$
where $D$ is a constant independent of $x$,
$$x_k = \mathop{\arg\min}\limits_{x}\{\frac{1}{2t_{k-1}}\|x - (x_{k-1} - t_{k-1}\nabla f(x_{k-1}))\|^2\}.$$
For the composite unconstrained problem $\min f(x) + g(x), x \in R^n$:
$$x_{k+1} = \mathop{\arg\min}\limits_{x}\{f(x_k) + \langle x - x_k, \nabla f(x_k)\rangle + \frac{1}{2t_k}\|x - x_k\|^2 + g(x)\}$$
$$x_{k+1} = \mathop{\arg\min}\limits_{x}\{\frac{1}{2}\|x - (x_k - t_k\nabla f(x_k))\|^2 + t_k g(x)\}$$
$$x_{k+1} = Prox_{t_k g}(x_k - t_k\nabla f(x_k))$$
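
Taking $g = \delta_C$ in this update recovers projected gradient descent, since $Prox_{t_k\delta_C} = P_C$. A minimal MATLAB sketch for $C = \{x : x \ge 0\}$, where the projection is simply `max(x, 0)` (random illustrative data):

```matlab
% Proximal gradient for min 0.5||Ax-b||^2 s.t. x >= 0 (g = indicator of the orthant).
rng(1); A = randn(30, 10); b = randn(30, 1);
t = 1 / norm(A)^2;                      % step 1/L, with L = lambda_max(A'A) = ||A||_2^2
x = zeros(10, 1);
for k = 1:500
    x = max(x - t * A'*(A*x - b), 0);   % gradient step, then Prox = projection onto C
end
```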

Basis pursuit problem

$$\mathop{min}\quad \|x\|_1 \quad s.t. \quad Ax = b$$
where $A \in R^{m \times n}$, $b \in R^m$.

Separable two-block convex optimization model

$$\mathop{min}\quad \theta_1(x) + \theta_2(y) \quad s.t. \quad Ax + By = c$$
where $A \in R^{m \times n}$, $B \in R^{m \times q}$, $c \in R^m$, and $\theta_1: R^n \to (-\infty, \infty]$, $\theta_2: R^q \to (-\infty, \infty]$ are two closed proper convex functions.

Iterative Shrinkage-Thresholding Algorithm (ISTA)

ISTA

$$f(x) = \frac{1}{2}\|Ax - b\|^2, \qquad g(x) = \lambda\|x\|_1$$
$$\nabla f(x_k) = A^T(Ax_k - b), \qquad t_k = \frac{1}{L_k}$$

Substituting into the proximal gradient update gives
$$x_{k+1} = Prox_{\frac{\lambda}{L_k}\|\cdot\|_1}(x_k - \frac{1}{L_k}A^T(Ax_k - b)),$$
where $L$ is the Lipschitz constant of $\nabla f(x)$; it suffices to take $L \ge \lambda_{max}(A^TA)$.
Writing $T_{\lambda t_k} = Prox_{t_k\lambda\|\cdot\|_1}$, the iteration becomes
$$x_{k+1} = T_{\frac{\lambda}{L_k}}(x_k - \frac{1}{L_k}A^T(Ax_k - b)).$$
The shrinkage-thresholding operator is
$$T_\lambda(x) = sgn(x)\max(|x| - \lambda, 0) = \max(0, x - \lambda) + \min(0, x + \lambda).$$
The convergence rate in objective value is $O(\frac{1}{k})$.

MATLAB core code

```matlab
T = @(tau, x) max(0, x - tau) + min(0, x + tau);        % shrinkage-thresholding operator
x_new = T(lambda/L0, x_old - (1/L0)*A'*(A*x_old - b));  % one ISTA step with step size 1/L0
```
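
A complete ISTA loop built around this kernel might look as follows (a sketch: the data are random and illustrative, and a fixed step $1/L_0$ with $L_0 = \lambda_{max}(A^TA)$ replaces any line search):

```matlab
% ISTA for min 0.5||Ax-b||^2 + lambda||x||_1 with a fixed step 1/L0.
rng(2); A = randn(40, 100); b = randn(40, 1); lambda = 0.1;
L0 = norm(A)^2;                                   % Lipschitz constant of the gradient
T = @(tau, x) max(0, x - tau) + min(0, x + tau);  % shrinkage-thresholding operator
x = zeros(100, 1);
for k = 1:1000
    x = T(lambda/L0, x - (1/L0)*A'*(A*x - b));    % one ISTA step per iteration
end
```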

Fast Iterative Shrinkage-Thresholding Algorithm (FISTA)

FISTA improves on ISTA by adding an extrapolation (momentum) step:

Step 0: $y_1 = x_0$, $t_1 = 1$.
Step $k$ $(k \ge 1)$:
$$x_k = T_{\frac{\lambda}{L_k}}(y_k - \frac{1}{L_k}A^T(Ay_k - b))$$
$$t_{k+1} = \frac{1 + \sqrt{1 + 4t_k^2}}{2}$$
$$y_{k+1} = x_k + (\frac{t_k - 1}{t_{k+1}})(x_k - x_{k-1})$$
The convergence rate in objective value improves to $O(\frac{1}{k^2})$.
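
A MATLAB sketch of these steps with a fixed Lipschitz constant $L$ (random illustrative data):

```matlab
% FISTA for min 0.5||Ax-b||^2 + lambda||x||_1.
rng(3); A = randn(40, 100); b = randn(40, 1); lambda = 0.1;
L = norm(A)^2;
T = @(tau, x) max(0, x - tau) + min(0, x + tau);
x_old = zeros(100, 1); y = x_old; t = 1;
for k = 1:1000
    x = T(lambda/L, y - (1/L)*A'*(A*y - b));   % gradient step at y, then shrinkage
    t_new = (1 + sqrt(1 + 4*t^2)) / 2;
    y = x + ((t - 1)/t_new) * (x - x_old);     % momentum extrapolation
    x_old = x; t = t_new;
end
```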

Generalized Extragradient Method (GEM)

GEM is obtained by generalizing the extragradient method (EM). EM uses the projection operator ($P_C$, projection operator) and applies to $MVI(\theta, F, C)$; GEM uses the proximity operator ($Prox$, proximity operator) and applies to $MGVI(\theta, F)$. The remaining steps are essentially the same: both are prediction-correction algorithms.

GEM

For the $\ell_1$-regularized least squares problem, the optimality (KKT) conditions show that it is an $MGVI(\theta, F)$ with
$$\theta(x) = \lambda\|x\|_1, \qquad F(x) = A^T(Ax - b).$$
Take $\beta^k \le \frac{\nu}{L}$ with $\nu \in (0, 1)$, where $L$ is the Lipschitz constant of $F(x)$. GEM is a prediction-correction algorithm:
Prediction: $\widetilde{x}^k = Prox_{\beta^k\theta}(x^k - \beta^k F(x^k))$
Correction: $x^{k+1} = Prox_{\beta^k\theta}(x^k - \beta^k F(\widetilde{x}^k))$
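
For this problem $Prox_{\beta^k\theta}$ is soft-thresholding with threshold $\beta^k\lambda$, so both steps are cheap. A minimal MATLAB sketch with a fixed $\beta = \nu/L$ (one simple choice; the data are random and illustrative):

```matlab
% GEM (prediction-correction) for theta = lambda||.||_1, F(x) = A'(Ax - b).
rng(4); A = randn(40, 100); b = randn(40, 1); lambda = 0.1;
L = norm(A)^2; nu = 0.9; beta = nu / L;
T = @(tau, x) max(0, x - tau) + min(0, x + tau);
F = @(x) A'*(A*x - b);
x = zeros(100, 1);
for k = 1:1000
    xt = T(beta*lambda, x - beta*F(x));    % prediction
    x  = T(beta*lambda, x - beta*F(xt));   % correction
end
```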

Proximity and Contraction Algorithms (PCA)

The shrinkage-thresholding operator ($T$) and the projection operator ($P_C$) are special cases of the proximity operator ($Prox$). The proximity operator is nonexpansive; this contraction property is what makes the generated sequence $\{x^k\}$ converge.

Proximal Point Algorithm (PPA, proximal point algorithm)

Proximal Gradient Algorithms (PGA)

PGA uses the proximity operator ($Prox$) in place of an explicit gradient step. Two facts are used repeatedly in the analysis. For any solution $x^*$, the variational inequality gives
$$\theta(\widetilde{x}^k) - \theta(x^*) + (\widetilde{x}^k - x^*)^T F(x^*) \ge 0,$$
and monotonicity gives
$$(\widetilde{x}^k - x^*)^T(F(\widetilde{x}^k) - F(x^*)) \ge 0.$$
Adding the two yields $\theta(\widetilde{x}^k) - \theta(x^*) + (\widetilde{x}^k - x^*)^T F(\widetilde{x}^k) \ge 0$, which is the basis of the contraction analysis below.

$PGA_{a1}$

$F(x) = Mx + q$, with $M$ positive semidefinite (symmetry not required).
Take $x^0 \in R^n$ and $\gamma \in (0, 2)$; the computed step satisfies $\alpha^*_k \ge \frac{1}{\|I + \beta^k M^T\|_2^2}$.
Predictor: choose $\beta^k > 0$ and set $\widetilde{x}^k = Prox_{\beta^k\theta}(x^k - \beta^k F(x^k))$;
Corrector: $d(x^k, \widetilde{x}^k, \beta^k) = (I + \beta^k M^T)(x^k - \widetilde{x}^k)$; compute $\alpha^*_k = \frac{\|x^k - \widetilde{x}^k\|^2}{\|d(x^k, \widetilde{x}^k, \beta^k)\|^2}$ and set $x^{k+1} = x^k - \gamma\alpha^*_k d(x^k, \widetilde{x}^k, \beta^k)$.
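
For the $\ell_1$ problem, $F(x) = Mx + q$ with $M = A^TA$ and $q = -A^Tb$. A MATLAB sketch of $PGA_{a1}$ with a fixed $\beta$ (one simple choice among many; the data are random and illustrative):

```matlab
% PGA_a1 for F(x) = Mx + q with M = A'A, q = -A'b.
rng(5); A = randn(40, 100); b = randn(40, 1); lambda = 0.1;
M = A'*A; q = -A'*b;
beta = 0.9 / norm(M); gamma = 1.9;
T = @(tau, x) max(0, x - tau) + min(0, x + tau);
x = zeros(100, 1);
for k = 1:1000
    xt = T(beta*lambda, x - beta*(M*x + q));   % predictor
    d  = (eye(100) + beta*M') * (x - xt);      % corrector direction
    if norm(d) == 0, break; end                % xt = x means x already solves the MGVI
    alpha = norm(x - xt)^2 / norm(d)^2;        % optimal step length
    x = x - gamma*alpha*d;
end
```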

$PGA_{a2}$

$F(x) = Mx + q$, with $M$ symmetric positive semidefinite; let $G = I + \beta M$.
Take $x^0 \in R^n$, $\beta > 0$, and $\gamma \in (0, 2)$; the computed step satisfies $\alpha^*_k \ge \frac{1}{\lambda_{max}(G)}$.
Predictor: $\widetilde{x}^k = Prox_{\beta\theta}(x^k - \beta F(x^k))$;
Corrector: $d(x^k, \widetilde{x}^k) = x^k - \widetilde{x}^k$; compute $\alpha^*_k = \frac{\|x^k - \widetilde{x}^k\|^2}{\|d(x^k, \widetilde{x}^k)\|_G^2}$ and set $x^{k+1} = x^k - \gamma\alpha^*_k d(x^k, \widetilde{x}^k)$.

$PGA_{b1}$

$F(x)$ monotone.
Take $x^0 \in R^n$ and $\gamma \in (0, 2)$; the computed step satisfies $\alpha^*_k \ge \frac{1}{2}$.
Predictor: choose $\beta^k > 0$ such that $\beta^k\|F(x^k) - F(\widetilde{x}^k)\| \le \nu\|x^k - \widetilde{x}^k\|$ (with $\nu \in (0, 1)$), where $\widetilde{x}^k = Prox_{\beta^k\theta}(x^k - \beta^k F(x^k))$;
Corrector: set $d(x^k, \widetilde{x}^k, \beta^k) = (x^k - \widetilde{x}^k) - \beta^k(F(x^k) - F(\widetilde{x}^k))$; compute $\alpha^*_k = \frac{(x^k - \widetilde{x}^k)^T d(x^k, \widetilde{x}^k, \beta^k)}{\|d(x^k, \widetilde{x}^k, \beta^k)\|^2}$ and set $x^{k+1} = x^k - \gamma\alpha^*_k d(x^k, \widetilde{x}^k, \beta^k)$.
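
A sketch of $PGA_{b1}$ in which $\beta^k$ is found by halving until the predictor test holds (a simple backtracking rule; self-adaptive updates of $\beta^k$ are also common). The data are random and illustrative:

```matlab
% PGA_b1 for monotone F: backtracking predictor, then corrector step.
rng(6); A = randn(40, 100); b = randn(40, 1); lambda = 0.1;
F = @(x) A'*(A*x - b);
T = @(tau, x) max(0, x - tau) + min(0, x + tau);
nu = 0.9; gamma = 1.9; beta = 1;
x = zeros(100, 1);
for k = 1:500
    xt = T(beta*lambda, x - beta*F(x));
    while beta*norm(F(x) - F(xt)) > nu*norm(x - xt)   % shrink beta until the test holds
        beta = beta / 2;
        xt = T(beta*lambda, x - beta*F(x));
    end
    d = (x - xt) - beta*(F(x) - F(xt));
    if norm(d) == 0, break; end                        % xt = x: current point is a solution
    alpha = (x - xt)'*d / norm(d)^2;
    x = x - gamma*alpha*d;
end
```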

$PGA_{b2}$

$F(x) = Ax + c$, with $A$ symmetric positive semidefinite; let $G = I - \beta A$, with $G$ symmetric positive definite.
Take $x^0 \in R^n$, $0 < \beta < \frac{1}{\lambda_{max}(A)}$, and $\gamma \in (0, 2)$.
Predictor: $\widetilde{x}^k = Prox_{\beta\theta}(x^k - \beta F(x^k))$;
Corrector: $d(x^k, \widetilde{x}^k) = x^k - \widetilde{x}^k$; set $x^{k+1} = x^k - \gamma d(x^k, \widetilde{x}^k)$.

Alternating Direction Method of Multipliers (ADMM)

$$\mathop{min}\limits_{x,y,z} \theta_1(x) + \theta_2(y) + \theta_3(z) \quad s.t. \quad f(x, y, z) = 0$$
Lagrangian: $L(x, y, z, \lambda) = \theta_1(x) + \theta_2(y) + \theta_3(z) + \lambda^T f(x, y, z)$,
where $\lambda$ is the Lagrange multiplier.
Augmented Lagrangian: $L_\rho(x, y, z, \lambda) = \theta_1(x) + \theta_2(y) + \theta_3(z) + \lambda^T f(x, y, z) + \frac{\rho}{2}\|f(x, y, z)\|_2^2$,
where $\lambda$ is the Lagrange multiplier and $\rho > 0$ is the penalty parameter.

ADMM

$$\mathop{min}\limits_{x,y} \theta_1(x) + \theta_2(y) \quad s.t. \quad Ax + By = c$$
Initialization: $x^0 \in R^n$, $y^0 \in R^q$, $\lambda^0 \in R^m$, and $\rho > 0$.
General step: for $k = 0, 1, \ldots$ execute:
(a) $x^{k+1} \in \mathop{\arg\min}\{\theta_1(x) + \frac{\rho}{2}\|Ax + By^k - c + \frac{1}{\rho}\lambda^k\|^2 : x \in R^n\}$;
(b) $y^{k+1} \in \mathop{\arg\min}\{\theta_2(y) + \frac{\rho}{2}\|Ax^{k+1} + By - c + \frac{1}{\rho}\lambda^k\|^2 : y \in R^q\}$;
(c) $\lambda^{k+1} = \lambda^k + \rho(Ax^{k+1} + By^{k+1} - c)$.
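
As a concrete instance, the LASSO fits the two-block form as $\min \frac{1}{2}\|Ax - b\|^2 + \lambda\|y\|_1$ s.t. $x - y = 0$, i.e. the model's coefficient matrices are $I$ and $-I$ with $c = 0$, while the data matrix $A$ sits inside $\theta_1$. Step (a) is then a linear solve and step (b) a shrinkage. A minimal MATLAB sketch (random illustrative data; `mu` denotes the multiplier):

```matlab
% ADMM for the LASSO split: min 0.5||Ax-b||^2 + lambda||y||_1  s.t.  x - y = 0.
rng(7); A = randn(40, 100); b = randn(40, 1); lambda = 0.1;
n = 100; rho = 1;
T = @(tau, v) max(0, v - tau) + min(0, v + tau);   % shrinkage-thresholding
y = zeros(n, 1); mu = zeros(n, 1);                 % mu: Lagrange multiplier
AtA = A'*A; Atb = A'*b;
for k = 1:500
    x  = (AtA + rho*eye(n)) \ (Atb + rho*y - mu);  % (a) minimize over x: linear solve
    y  = T(lambda/rho, x + mu/rho);                % (b) minimize over y: shrinkage
    mu = mu + rho*(x - y);                         % (c) multiplier update
end
```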

Alternating Direction Linearized Proximal Method of Multipliers (AD-LPMM)

AD-LPMM is a special case of the alternating direction proximal method of multipliers (AD-PMM).
Initialization: $x^0 \in R^n$, $y^0 \in R^q$, $\lambda^0 \in R^m$;
$\rho > 0$, $\alpha \ge \rho\lambda_{max}(A^TA)$, $\beta \ge \rho\lambda_{max}(B^TB)$.
General step: for $k = 0, 1, \ldots$ execute:
(a) $x^{k+1} = Prox_{\frac{1}{\alpha}\theta_1}[x^k - \frac{\rho}{\alpha}A^T(Ax^k + By^k - c + \frac{1}{\rho}\lambda^k)]$;
(b) $y^{k+1} = Prox_{\frac{1}{\beta}\theta_2}[y^k - \frac{\rho}{\beta}B^T(Ax^{k+1} + By^k - c + \frac{1}{\rho}\lambda^k)]$;
(c) $\lambda^{k+1} = \lambda^k + \rho(Ax^{k+1} + By^{k+1} - c)$.
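
A sketch of AD-LPMM on the splitting $\min \lambda\|x\|_1 + \frac{1}{2}\|y - b\|^2$ s.t. $Ax - y = 0$ (so $B = -I$ and $c = 0$; the prox of $\frac{1}{\beta}\theta_2$ has the closed form used below). The data and constants are illustrative:

```matlab
% AD-LPMM for min lambda||x||_1 + 0.5||y-b||^2  s.t.  Ax - y = 0.
rng(8); A = randn(40, 100); b = randn(40, 1); lambda = 0.1;
rho = 1;
alpha = rho * norm(A)^2;   % alpha >= rho*lambda_max(A'A)
beta  = rho;               % beta  >= rho*lambda_max(B'B) with B = -I
T = @(tau, v) max(0, v - tau) + min(0, v + tau);
x = zeros(100, 1); y = zeros(40, 1); mu = zeros(40, 1);
for k = 1:500
    x  = T(lambda/alpha, x - (rho/alpha)*A'*(A*x - y + mu/rho));  % (a) linearized x-step
    v  = y + (rho/beta)*(A*x - y + mu/rho);                       % (b) linearized y-step
    y  = (beta*v + b) / (beta + 1);                               %     prox of (1/beta)*0.5||.-b||^2
    mu = mu + rho*(A*x - y);                                      % (c) multiplier update
end
```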

Source: https://blog.csdn.net/m0_58786865/article/details/135413317