1. Linear least squares
model: $\boldsymbol{y}_{c}(\boldsymbol{x})=\boldsymbol{A}\boldsymbol{x}$

objective function: $f(\boldsymbol{x})=\frac{1}{2}\|\boldsymbol{y}_{s}-\boldsymbol{y}_{c}(\boldsymbol{x})\|^{2}=\frac{1}{2}\|\boldsymbol{y}_{s}-\boldsymbol{A}\boldsymbol{x}\|^{2}$

$f^{\prime}(\boldsymbol{x})=-\boldsymbol{A}^{T}(\boldsymbol{y}_{s}-\boldsymbol{A}\boldsymbol{x})=\boldsymbol{0}$

$\boldsymbol{x}=(\boldsymbol{A}^{T}\boldsymbol{A})^{-1}\boldsymbol{A}^{T}\boldsymbol{y}_{s}$
No iteration is needed; the solution is obtained in closed form.
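A minimal sketch of the closed-form solution above, assuming a hypothetical line-fitting problem (the data values and design matrix are illustrative, not from the text):

```python
import numpy as np

# Hypothetical example: fit y = x0 + x1 * t to samples.
# The columns of A are the basis functions evaluated at the sample points t.
t = np.array([0.0, 1.0, 2.0, 3.0])
y_s = np.array([1.0, 3.0, 5.0, 7.0])        # samples of y = 1 + 2 t (noise-free)
A = np.column_stack([np.ones_like(t), t])   # design matrix

# Normal-equation solution x = (A^T A)^{-1} A^T y_s,
# computed via solve() rather than an explicit inverse.
x = np.linalg.solve(A.T @ A, A.T @ y_s)
print(x)  # → [1. 2.]
```

In practice `np.linalg.lstsq(A, y_s)` is preferred for numerical stability, but the normal equations mirror the derivation above directly.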
2. Nonlinear optimization (gradient descent)
model: $\boldsymbol{y}_{c}(\boldsymbol{x})=g(\boldsymbol{x})$, with Jacobian $\frac{\partial \boldsymbol{y}_{c}(\boldsymbol{x})}{\partial \boldsymbol{x}}=\boldsymbol{J}(\boldsymbol{x})$

objective function: $f(\boldsymbol{x})=\frac{1}{2}\|\boldsymbol{y}_{s}-\boldsymbol{y}_{c}(\boldsymbol{x})\|^{2}=\frac{1}{2}\|\boldsymbol{y}_{s}-g(\boldsymbol{x})\|^{2}$

With residual $\boldsymbol{e}=\boldsymbol{y}_{s}-g(\boldsymbol{x})$, the gradient is $f^{\prime}(\boldsymbol{x})=-\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{e}$; the descent direction is its negative:

$\boldsymbol{d}=-f^{\prime}(\boldsymbol{x})=\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{e}$

$\boldsymbol{x}_{(t+1)}=\boldsymbol{x}_{(t)}+\alpha\cdot\boldsymbol{d}$
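The update rule above can be sketched on a toy model (an assumption for illustration: $y_c(\boldsymbol{x})_i = x_i^2$, so $\boldsymbol{J}(\boldsymbol{x}) = \mathrm{diag}(2x_i)$):

```python
import numpy as np

# Toy model (illustrative assumption): y_c(x)_i = x_i^2.
def g(x):
    return x**2

def J(x):
    return np.diag(2 * x)   # Jacobian of g

y_s = np.array([4.0, 9.0])  # "measured" data; the true parameters are [2, 3]
x = np.array([1.0, 1.0])    # initial guess
alpha = 0.05                # step size

for _ in range(1000):
    e = y_s - g(x)          # residual
    d = J(x).T @ e          # descent direction d = -f'(x) = J^T e
    x = x + alpha * d       # gradient-descent update

print(x)  # converges toward [2, 3]
```

Note how slowly plain gradient descent converges here compared with the Gauss-Newton step derived next; that contrast is the motivation for the following sections.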
3. Nonlinear optimization (Gauss-Newton)
model: $\boldsymbol{y}_{c}(\boldsymbol{x})=g(\boldsymbol{x})$, with Jacobian $\frac{\partial \boldsymbol{y}_{c}(\boldsymbol{x})}{\partial \boldsymbol{x}}=\boldsymbol{J}(\boldsymbol{x})$

First-order Taylor expansion of the model: $\boldsymbol{y}_{c}(\boldsymbol{x}_{(t+1)})=\boldsymbol{y}_{c}(\boldsymbol{x}_{(t)}+\boldsymbol{\delta})\approx\boldsymbol{y}_{c}(\boldsymbol{x}_{(t)})+\boldsymbol{J}(\boldsymbol{x}_{(t)})\boldsymbol{\delta}$

objective function, with residual $\boldsymbol{e}=\boldsymbol{y}_{s}-\boldsymbol{y}_{c}(\boldsymbol{x}_{(t)})$:

$f(\boldsymbol{x}_{(t+1)})=\frac{1}{2}\|\boldsymbol{y}_{s}-\boldsymbol{y}_{c}(\boldsymbol{x}_{(t)})-\boldsymbol{J}(\boldsymbol{x}_{(t)})\boldsymbol{\delta}\|^{2}=\frac{1}{2}\|\boldsymbol{e}-\boldsymbol{J}(\boldsymbol{x}_{(t)})\boldsymbol{\delta}\|^{2}$

Setting the derivative with respect to $\boldsymbol{\delta}$ to zero:

$\frac{\partial f}{\partial \boldsymbol{\delta}}=-\boldsymbol{J}(\boldsymbol{x})^{T}(\boldsymbol{e}-\boldsymbol{J}(\boldsymbol{x})\boldsymbol{\delta})=\boldsymbol{0}$

$\boldsymbol{\delta}=(\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{J}(\boldsymbol{x}))^{-1}\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{e}$

$\boldsymbol{x}_{(t+1)}=\boldsymbol{x}_{(t)}+\boldsymbol{\delta}$
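A sketch of the Gauss-Newton iteration on the same assumed toy model ($y_c(\boldsymbol{x})_i = x_i^2$, $\boldsymbol{J} = \mathrm{diag}(2x_i)$):

```python
import numpy as np

# Same illustrative toy model: y_c(x)_i = x_i^2, J(x) = diag(2 x_i).
def g(x):
    return x**2

def J(x):
    return np.diag(2 * x)

y_s = np.array([4.0, 9.0])
x = np.array([1.0, 1.0])

for _ in range(20):
    e = y_s - g(x)
    Jx = J(x)
    # Gauss-Newton step: delta = (J^T J)^{-1} J^T e
    delta = np.linalg.solve(Jx.T @ Jx, Jx.T @ e)
    x = x + delta

print(x)  # → [2. 3.]
```

On this problem each step linearizes the residual and solves a linear least-squares subproblem, so it reaches machine precision in a handful of iterations, far fewer than gradient descent.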
4. Nonlinear optimization (Levenberg-Marquardt, LM)
In the step above, replace $\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{J}(\boldsymbol{x})$ with $\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{J}(\boldsymbol{x})+\lambda\boldsymbol{I}$, giving $\boldsymbol{\delta}=(\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{J}(\boldsymbol{x})+\lambda\boldsymbol{I})^{-1}\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{e}$. Choosing $\lambda$ large enough ensures $\boldsymbol{J}(\boldsymbol{x})^{T}\boldsymbol{J}(\boldsymbol{x})+\lambda\boldsymbol{I}$ is always positive definite and hence invertible. As $\lambda$ shrinks, the LM step approaches the Gauss-Newton step; as $\lambda$ grows, it approaches a (scaled-down) gradient-descent step.
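A minimal LM sketch on the same assumed toy model; the doubling/halving schedule for $\lambda$ is one common heuristic, an assumption here rather than something prescribed by the text:

```python
import numpy as np

# Same illustrative toy model: y_c(x)_i = x_i^2, J(x) = diag(2 x_i).
def g(x):
    return x**2

def J(x):
    return np.diag(2 * x)

y_s = np.array([4.0, 9.0])
x = np.array([1.0, 1.0])
lam = 1e-2                                   # damping factor lambda

for _ in range(50):
    e = y_s - g(x)
    Jx = J(x)
    H = Jx.T @ Jx + lam * np.eye(len(x))     # damped normal matrix, always PD
    delta = np.linalg.solve(H, Jx.T @ e)
    x_new = x + delta
    # Simple heuristic schedule: accept the step if the cost decreased
    # (then trust the model more, shrinking lambda), otherwise raise lambda.
    if np.sum((y_s - g(x_new))**2) < np.sum(e**2):
        x, lam = x_new, lam * 0.5
    else:
        lam *= 2.0

print(x)  # converges toward [2, 3]
```

With small $\lambda$ the step is essentially Gauss-Newton; when a step fails, growing $\lambda$ turns it into a short, safe gradient-descent-like step, exactly the interpolation described above.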
5. Constrained nonlinear optimization (penalty method + LM)
The idea of the penalty method is to use a penalty function to convert a constrained problem into an unconstrained one, which can then be solved by the methods above.

Given an optimization problem of the form:
$$\begin{aligned} \min \ & f(x) \\ \text{s.t.}\ & h_{i}(x) = 0,\quad i=1,2,\cdots,n, \\ & g_{j}(x) \geq 0,\quad j=1,2,\cdots,m, \end{aligned}$$
construct the penalized objective:
$$\min \ F(x)=f(x)+M\left(\sum_{i=1}^{n}[h_{i}(x)]^{2}+\sum_{j=1}^{m}[\min(0,g_{j}(x))]^{2}\right)$$
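A sketch of the penalty construction on a hypothetical equality-constrained problem (the objective, constraint, and fixed penalty weight are all illustrative assumptions; in practice $M$ is increased gradually):

```python
import numpy as np

# Hypothetical problem: minimize f(x) = (x0-2)^2 + (x1-2)^2
# subject to h(x) = x0 + x1 - 2 = 0  (constrained minimizer: [1, 1]).
M = 100.0                      # penalty weight (fixed here for simplicity)

def grad_F(x):
    # Gradient of F(x) = f(x) + M * h(x)^2
    h = x[0] + x[1] - 2.0
    return 2 * (x - 2.0) + 2 * M * h * np.array([1.0, 1.0])

x = np.zeros(2)
alpha = 0.002                  # small step: F is ill-conditioned for large M

for _ in range(5000):
    x = x - alpha * grad_F(x)  # plain gradient descent on the penalized F

print(x)  # near [1, 1]; exact only in the limit M -> infinity
```

The minimizer of $F$ only approaches the constrained solution as $M \to \infty$ (here it lands at roughly $[1.005, 1.005]$), which is why penalty methods typically solve a sequence of problems with growing $M$, warm-starting each from the previous solution.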