EM Algorithm Notes (4): An Example

The EM algorithm for general parameter estimation problems

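Let $\mathbf Y$ denote the observed data, $\mathbf Z$ the hidden variables or missing data, and $\theta$ the model parameters. With $p(\mathbf Y,\mathbf Z|\theta)$ denoting the likelihood function of the complete data $(\mathbf Y,\mathbf Z)$, the EM algorithm for a general parameter estimation problem is: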

  • E-step: compute $Q(\theta,\theta^{(i)})=E_{p(\mathbf Z|\mathbf Y,\theta^{(i)})}\left[\ln p(\mathbf Y,\mathbf Z|\theta)\mid\mathbf Y,\theta^{(i)}\right]$
  • M-step: find $\theta^{(i+1)}=\operatorname*{arg\,max}_{\theta} Q(\theta,\theta^{(i)})$
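The two steps can be sketched as a generic loop. This is a minimal illustration, not a reference implementation: `run_em`, `toy_e_step`, `toy_m_step` and the toy problem (estimating only the mean of a unit-variance Gaussian from three observed values and one missing value) are assumptions introduced here.

```python
def run_em(theta0, e_step, m_step, n_iter=50):
    """Alternate E and M steps for a fixed number of iterations."""
    theta = theta0
    for _ in range(n_iter):
        # E-step: expectations taken under p(Z | Y, theta^(i))
        expectations = e_step(theta)
        # M-step: the theta that maximizes Q(theta, theta^(i))
        theta = m_step(expectations)
    return theta


# Toy problem (assumed for illustration): three observed values, one missing
# value, unit-variance Gaussian, only the mean mu is estimated.
observed = [0.0, 1.0, 2.0]


def toy_e_step(mu):
    # Expected value of the missing point under the current mean
    return mu


def toy_m_step(expected_missing):
    # Maximizing Q over mu gives the average of observed and expected values
    return (sum(observed) + expected_missing) / (len(observed) + 1)


mu_hat = run_em(0.0, toy_e_step, toy_m_step)
print(mu_hat)
```

In this toy case the update is $\mu \leftarrow (3+\mu)/4$, so the iterates approach the mean of the observed values.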

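The EM algorithm is a good fit when: (1) the goal is to estimate the parameters of a probabilistic model; (2) the observed data set is incomplete, either because some data are missing or because the model has explicit hidden variables; (3) the maximum likelihood solution for the complete-data log-likelihood is easy to compute.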

Derivation of Example 2 in Section 3.9 of *Pattern Classification*

Consider a sample set $\mathcal D=\{ \boldsymbol{x}_{1},\boldsymbol{x}_{2},\cdots,\boldsymbol{x}_{N} \}$ drawn from some particular distribution, and suppose that some features of the sample points are missing. That is, $\boldsymbol{x}_{k}=(\boldsymbol{x}_{kg},\boldsymbol{x}_{kb})$, where the features $\boldsymbol{x}_{kg}$ are intact (good) and the features $\boldsymbol{x}_{kb}$ are missing (bad).

  • Suppose the data set is $\mathcal D=\{ \boldsymbol{x}_{1},\boldsymbol{x}_{2},\boldsymbol{x}_{3},\boldsymbol{x}_{4} \}=\left\{ \left( \begin{matrix} 0 \\ 2 \end{matrix}\right), \left( \begin{matrix} 1 \\ 0 \end{matrix}\right), \left( \begin{matrix} 2 \\ 2 \end{matrix}\right), \left( \begin{matrix} * \\ 4 \end{matrix}\right) \right\}$, where $*$ indicates that the first component $x_{41}$ of $\boldsymbol{x}_{4}$ is unknown. Thus $\mathbf Y=\{\boldsymbol{x}_{1},\boldsymbol{x}_{2},\boldsymbol{x}_{3},x_{42}\}$ and $\mathbf Z=x_{41}$.
  • Suppose the data follow a two-dimensional Gaussian distribution whose covariance matrix is diagonal, $\boldsymbol\Sigma=\left[ \begin{matrix} \sigma_{1}^{2} & 0 \\ 0 & \sigma_{2}^{2}\end{matrix}\right]$, so that the parameters are $\theta=[\mu_{1}, \mu_{2}, \sigma_{1}^{2}, \sigma_{2}^{2}]^{T}$, that is:
    $$\begin{aligned} p(\boldsymbol{x}|\theta)&=\mathcal{N}(\boldsymbol{x}|\boldsymbol{\mu},\boldsymbol{\Sigma})\\ &= \dfrac{1}{2\pi{|\boldsymbol\Sigma|}^{\frac{1}{2}}}e^{-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^{T}\boldsymbol\Sigma^{-1}(\boldsymbol{x}-\boldsymbol{\mu})}\\ &= \dfrac{1}{2\pi\sigma_{1}\sigma_{2}}e^{-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^{T}\boldsymbol\Sigma^{-1}(\boldsymbol{x}-\boldsymbol{\mu})}\\ \end{aligned}$$
    The likelihood function of the data set $\mathcal D$ (here $N=4$) is $\prod_{k=1}^{4} p(\boldsymbol{x}_{k}|\theta)$.
    The log-likelihood function of the data set $\mathcal D$ is $\sum\limits_{k=1}^{4}\ln p(\boldsymbol{x}_{k}|\theta)$.
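The setup above can be written down directly in code. The following is a small sketch using only the standard library; the names `DATA`, `log_density` and `log_likelihood`, and the use of `None` to mark the missing component $x_{41}$, are choices made here for illustration.

```python
import math

# Data set from the example; None marks the missing component x_41.
DATA = [(0.0, 2.0), (1.0, 0.0), (2.0, 2.0), (None, 4.0)]


def log_density(x, theta):
    """ln p(x | theta) for a 2-D Gaussian with diagonal covariance.

    theta = (mu1, mu2, var1, var2), where var1 and var2 are the variances.
    """
    mu1, mu2, var1, var2 = theta
    return (-math.log(2.0 * math.pi * math.sqrt(var1 * var2))
            - (x[0] - mu1) ** 2 / (2.0 * var1)
            - (x[1] - mu2) ** 2 / (2.0 * var2))


def log_likelihood(points, theta):
    """Sum of ln p(x_k | theta) over fully observed points."""
    return sum(log_density(x, theta) for x in points)


theta0 = (0.0, 0.0, 1.0, 1.0)
complete_points = [x for x in DATA if x[0] is not None]
ll_complete = log_likelihood(complete_points, theta0)
print(ll_complete)
```

Only the three complete points can be evaluated this way; the fourth term depends on the unknown $x_{41}$, which is why the E-step replaces it by an expectation.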

Let the current parameter estimate be $\theta^{(i)}$. Because the samples are independent, $p(\mathbf Z|\mathbf Y,\theta^{(i)})=p(x_{41}|\boldsymbol{x}_{1},\boldsymbol{x}_{2},\boldsymbol{x}_{3},x_{42},\theta^{(i)})=p(x_{41}|x_{42}=4,\theta^{(i)})$, so $Q(\theta,\theta^{(i)})$ can be written as:

$$\begin{aligned}Q(\theta,\theta^{(i)})&=E_{p(x_{41}|x_{42}=4,\theta^{(i)})}\left[\sum_{k=1}^{4} \ln p(\boldsymbol{x}_{k}|\theta)\right]\\ &=E_{p(x_{41}|x_{42}=4,\theta^{(i)})}\left[\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\ln p(\boldsymbol{x}_{4}|\theta)\right]\\ &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+E_{p(x_{41}|x_{42}=4,\theta^{(i)})}\left[\ln p(\boldsymbol{x}_{4}|\theta)\right]\\ &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\int_{-\infty}^{+\infty}\ln p(\boldsymbol{x}_{4}|\theta)\,p(x_{41}|x_{42}=4,\theta^{(i)})\,dx_{41}\\ &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\int_{-\infty}^{+\infty}\ln p(\boldsymbol{x}_{4}|\theta)\frac{p(x_{41},x_{42}=4|\theta^{(i)})}{p(x_{42}=4|\theta^{(i)})}dx_{41}\\ &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\int_{-\infty}^{+\infty}\ln p(\boldsymbol{x}_{4}|\theta)\frac{p(x_{41},x_{42}=4|\theta^{(i)})}{\int_{-\infty}^{+\infty}p(x_{41}^{'},x_{42}=4|\theta^{(i)})dx_{41}^{'}}dx_{41}\\ \end{aligned}$$

Here the marginal density $K\equiv\int_{-\infty}^{+\infty}p(x_{41}^{'},x_{42}=4|\theta^{(i)})dx_{41}^{'}$ is a constant that does not depend on $\theta$; we denote its value by $K$.

In addition, we have

$$\begin{aligned}\ln p(\boldsymbol{x}_{4}|\theta)&=\ln \left[\dfrac{1}{2\pi\sigma_{1}\sigma_{2}}e^{-\frac{1}{2}(\boldsymbol{x}_{4}-\boldsymbol{\mu})^{T}\boldsymbol\Sigma^{-1}(\boldsymbol{x}_{4}-\boldsymbol{\mu})}\right]\\ &=-\ln(2\pi\sigma_{1}\sigma_{2})-\frac{(x_{41}-\mu_{1})^{2}}{2\sigma_{1}^{2}}-\frac{(x_{42}-\mu_{2})^{2}}{2\sigma_{2}^{2}}\\ &=-\ln(2\pi\sigma_{1}\sigma_{2})-\frac{(x_{41}-\mu_{1})^{2}}{2\sigma_{1}^{2}}-\frac{(4-\mu_{2})^{2}}{2\sigma_{2}^{2}}\\ \end{aligned}$$

As in the book, this post works through only the first iteration.

For $i=0$ we have $\theta^{(0)}=[0,0,1,1]^{T}$, that is, $\boldsymbol\mu=\boldsymbol 0$ and $\boldsymbol\Sigma=\boldsymbol I$, hence

$$\begin{aligned}p(x_{41},x_{42}=4|\theta^{(0)})&= \left.\dfrac{1}{2\pi\sigma_{1}\sigma_{2}}e^{-\frac{1}{2}(\boldsymbol{x}_{4}-\boldsymbol{\mu})^{T}\boldsymbol\Sigma^{-1}(\boldsymbol{x}_{4}-\boldsymbol{\mu})}\right|_{\theta=\theta^{(0)}}\\ &=\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}\\ \end{aligned}$$
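As a quick sanity check on the constant $K$ at $\theta^{(0)}$, the sketch below integrates the joint density over $x_{41}$ with a plain trapezoid rule and compares the result with the closed form $e^{-8}/\sqrt{2\pi}$, which follows from the Gaussian integral $\int e^{-x^{2}/2}dx=\sqrt{2\pi}$. The helper names, the integration range $[-10,10]$ and the step count are illustrative assumptions.

```python
import math


def joint_at_theta0(x41, x42=4.0):
    """p(x41, x42 | theta^(0)) with mu = (0, 0) and Sigma = I."""
    return math.exp(-(x41 ** 2 + x42 ** 2) / 2.0) / (2.0 * math.pi)


def trapezoid(f, a, b, n):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for j in range(1, n):
        total += f(a + j * h)
    return total * h


# Numerical value of K; the tails beyond |x41| = 10 are negligible.
K_numeric = trapezoid(joint_at_theta0, -10.0, 10.0, 4000)
# Closed form obtained from the Gaussian integral.
K_closed = math.exp(-8.0) / math.sqrt(2.0 * math.pi)
print(K_numeric, K_closed)
```

Dividing the joint density by $K$ leaves $\frac{1}{\sqrt{2\pi}}e^{-x_{41}^{2}/2}$, so at $\theta^{(0)}$ the missing component is distributed as a standard normal.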
Therefore

$$\begin{aligned}Q(\theta,\theta^{(0)}) &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\int_{-\infty}^{+\infty}\ln p(\boldsymbol{x}_{4}|\theta)\frac{p(x_{41},x_{42}=4|\theta^{(0)})}{\int_{-\infty}^{+\infty}p(x_{41}^{'},x_{42}=4|\theta^{(0)})dx_{41}^{'}}dx_{41}\\ &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\frac{1}{K}\int_{-\infty}^{+\infty}\ln p(\boldsymbol{x}_{4}|\theta)\,p(x_{41},x_{42}=4|\theta^{(0)})\,dx_{41}\\ &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\frac{1}{K}\int_{-\infty}^{+\infty}\ln p(\boldsymbol{x}_{4}|\theta)\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}\\ &=\sum_{k=1}^{3} \ln p(\boldsymbol{x}_{k}|\theta)+\frac{1}{K}\int_{-\infty}^{+\infty}\left[-\ln(2\pi\sigma_{1}\sigma_{2})-\frac{(x_{41}-\mu_{1})^{2}}{2\sigma_{1}^{2}}-\frac{(4-\mu_{2})^{2}}{2\sigma_{2}^{2}}\right]\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}\\ \end{aligned}$$

Here, since $\ln(2\pi\sigma_{1}\sigma_{2})$ does not depend on $x_{41}$, we get

$$\begin{aligned}\frac{1}{K}\int_{-\infty}^{+\infty}\ln(2\pi\sigma_{1}\sigma_{2})\,p(x_{41},x_{42}=4|\theta^{(0)})\,dx_{41} &=\ln(2\pi\sigma_{1}\sigma_{2})\int_{-\infty}^{+\infty}\frac{p(x_{41},x_{42}=4|\theta^{(0)})}{\int_{-\infty}^{+\infty}p(x_{41}^{'},x_{42}=4|\theta^{(0)})dx_{41}^{'}}dx_{41} \\ &=\ln(2\pi\sigma_{1}\sigma_{2})\frac{\int_{-\infty}^{+\infty}p(x_{41},x_{42}=4|\theta^{(0)})dx_{41} }{\int_{-\infty}^{+\infty}p(x_{41}^{'},x_{42}=4|\theta^{(0)})dx_{41}^{'}}\\ &=\ln(2\pi\sigma_{1}\sigma_{2}) \end{aligned}$$

Likewise, $\dfrac{(4-\mu_{2})^{2}}{2\sigma_{2}^{2}}$ does not depend on $x_{41}$ either, so

$$\begin{aligned}\frac{1}{K}\int_{-\infty}^{+\infty}\dfrac{(4-\mu_{2})^{2}}{2\sigma_{2}^{2}}\,p(x_{41},x_{42}=4|\theta^{(0)})\,dx_{41} &=\dfrac{(4-\mu_{2})^{2}}{2\sigma_{2}^{2}} \end{aligned}$$

Therefore

$$\begin{aligned}Q(\theta,\theta^{(0)}) &=\sum_{k=1}^{3} [\ln p(\boldsymbol{x}_{k}|\theta)]-\ln(2\pi\sigma_{1}\sigma_{2})-\frac{(4-\mu_{2})^{2}}{2\sigma_{2}^{2}}\\ &\quad+\frac{1}{K}\int_{-\infty}^{+\infty}\left[-\frac{(x_{41}-\mu_{1})^{2}}{2\sigma_{1}^{2}}\right]\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}\\ \end{aligned}$$

It remains to evaluate $\dfrac{1}{K}\int_{-\infty}^{+\infty}\left[-\dfrac{(x_{41}-\mu_{1})^{2}}{2\sigma_{1}^{2}}\right]\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}$. Since $\mu_{1}^{2}$ does not depend on $x_{41}$,

$$\begin{aligned}\frac{1}{K}\int_{-\infty}^{+\infty}\left[-\frac{(x_{41}-\mu_{1})^{2}}{2\sigma_{1}^{2}}\right]\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}&=-\frac{1}{2\sigma_{1}^{2}}\frac{1}{K}\int_{-\infty}^{+\infty}(x_{41}^{2}-2\mu_{1}x_{41}+\mu_{1}^{2})\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}\\ &=-\frac{\mu_{1}^{2}}{2\sigma_{1}^{2}}-\frac{1}{2\sigma_{1}^{2}}\frac{1}{K}\int_{-\infty}^{+\infty}(x_{41}^{2}-2\mu_{1}x_{41})\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}\\ \end{aligned}$$

The remaining integral is evaluated by integration by parts. The boundary term vanishes because the exponential decays faster than $x_{41}$ grows, and the last integral is exactly $K$:

$$\begin{aligned}\frac{1}{K}\int_{-\infty}^{+\infty}(x_{41}^{2}-2\mu_{1}x_{41})\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41} &=\frac{1}{K}\int_{-\infty}^{+\infty}x_{41}(x_{41}-2\mu_{1})\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}\\ &=\frac{1}{K}\int_{-\infty}^{+\infty}(x_{41}-2\mu_{1})\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}d\left( \frac{x_{41}^{2}+4^{2}}{2}\right)\\ &=-\frac{1}{K}\int_{-\infty}^{+\infty}(x_{41}-2\mu_{1})\cdot\dfrac{1}{2\pi}d\left(e^{-\frac{x_{41}^{2}+4^{2}}{2}}\right)\\ &=\left[-\frac{1}{K}(x_{41}-2\mu_{1})\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}\right] \bigg|_{-\infty}^{+\infty}+\frac{1}{K}\int_{-\infty}^{+\infty}\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41}\\ &=0+\frac{K}{K}\\ &=1 \end{aligned}$$
Therefore

$$\begin{aligned}\frac{1}{K}\int_{-\infty}^{+\infty}\left[-\frac{(x_{41}-\mu_{1})^{2}}{2\sigma_{1}^{2}}\right]\cdot\dfrac{1}{2\pi}e^{-\frac{x_{41}^{2}+4^{2}}{2}}dx_{41} &=-\frac{\mu_{1}^{2}}{2\sigma_{1}^{2}}-\frac{1}{2\sigma_{1}^{2}}=-\frac{1+\mu_{1}^{2}}{2\sigma_{1}^{2}}\\ \end{aligned}$$
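The result above says that, for $x_{41}$ distributed as a standard normal (which is what $p(x_{41},x_{42}=4|\theta^{(0)})/K$ reduces to), the expectation of $-(x_{41}-\mu_{1})^{2}/(2\sigma_{1}^{2})$ equals $-(1+\mu_{1}^{2})/(2\sigma_{1}^{2})$. The sketch below checks this numerically for one arbitrary pair of values; the function names, the test values and the quadrature settings are assumptions made for illustration.

```python
import math


def std_normal_pdf(x):
    """Density of N(0, 1), i.e. p(x41, x42=4 | theta^(0)) divided by K."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def expect(g, lo=-12.0, hi=12.0, n=6000):
    """Trapezoid-rule approximation of E[g(x)] for x ~ N(0, 1)."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) * std_normal_pdf(lo) + g(hi) * std_normal_pdf(hi))
    for j in range(1, n):
        x = lo + j * h
        total += g(x) * std_normal_pdf(x)
    return total * h


# Arbitrary test values for mu_1 and sigma_1^2 (hypothetical, for checking only)
mu1, var1 = 0.3, 1.7
numeric = expect(lambda x: -(x - mu1) ** 2 / (2.0 * var1))
closed = -(1.0 + mu1 ** 2) / (2.0 * var1)
print(numeric, closed)
```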

We thus obtain

$$\begin{aligned}Q(\theta,\theta^{(0)}) &=\sum_{k=1}^{3}[ \ln p(\boldsymbol{x}_{k}|\theta)]-\ln(2\pi\sigma_{1}\sigma_{2})-\frac{1+\mu_{1}^{2}}{2\sigma_{1}^{2}}-\frac{(4-\mu_{2})^{2}}{2\sigma_{2}^{2}}\\ &=\sum_{k=1}^{4}[-\ln(2\pi\sigma_{1}\sigma_{2})]-\frac{\mu_{1}^{2}+(1-\mu_{1})^{2}+(2-\mu_{1})^{2}+(1+\mu_{1}^{2})}{2\sigma_{1}^{2}}\\ &\qquad-\frac{(2-\mu_{2})^{2}+\mu_{2}^{2}+(2-\mu_{2})^{2}+(4-\mu_{2})^{2}}{2\sigma_{2}^{2}}\\ &=-4\ln(2\pi\sigma_{1}\sigma_{2})-\frac{4\mu_{1}^{2}-6\mu_{1}+6}{2\sigma_{1}^{2}}-\frac{4\mu_{2}^{2}-16\mu_{2}+24}{2\sigma_{2}^{2}} \end{aligned}$$

Setting the partial derivatives of $Q(\theta,\theta^{(0)})$ with respect to each parameter to zero gives:
(1) Differentiating with respect to $\mu_{1}$ gives $8\mu_{1}-6=0$, so $\mu_{1}^{(1)}=0.75$.
(2) Differentiating with respect to $\mu_{2}$ gives $8\mu_{2}-16=0$, so $\mu_{2}^{(1)}=2$.
(3) Differentiating with respect to $\sigma_{1}$ gives $-4\dfrac{2\pi\sigma_{2}}{2\pi\sigma_{1}\sigma_{2}}+\dfrac{4\mu_{1}^{2}-6\mu_{1}+6}{\sigma_{1}^{3}}=0$, that is, $\sigma_{1}^{2}=\dfrac{4\mu_{1}^{2}-6\mu_{1}+6}{4}$. Substituting $\mu_{1}^{(1)}=0.75$ gives $(\sigma_{1}^{2})^{(1)}=0.9375\approx0.938$.
(4) Differentiating with respect to $\sigma_{2}$ gives $-4\dfrac{2\pi\sigma_{1}}{2\pi\sigma_{1}\sigma_{2}}+\dfrac{4\mu_{2}^{2}-16\mu_{2}+24}{\sigma_{2}^{3}}=0$, that is, $\sigma_{2}^{2}=\dfrac{4\mu_{2}^{2}-16\mu_{2}+24}{4}$. Substituting $\mu_{2}^{(1)}=2$ gives $(\sigma_{2}^{2})^{(1)}=2$.
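The stationary point found above can be checked against the closed-form expression for $Q(\theta,\theta^{(0)})$. The sketch below evaluates $Q$ at the candidate maximizer and at small perturbations of each coordinate; `q_theta0`, the step size `eps` and the perturbation scheme are assumptions introduced for this check, and it only probes a local maximum along the coordinate directions.

```python
import math


def q_theta0(mu1, mu2, sigma1, sigma2):
    """Closed-form Q(theta, theta^(0)) from the last line of the derivation."""
    return (-4.0 * math.log(2.0 * math.pi * sigma1 * sigma2)
            - (4.0 * mu1 ** 2 - 6.0 * mu1 + 6.0) / (2.0 * sigma1 ** 2)
            - (4.0 * mu2 ** 2 - 16.0 * mu2 + 24.0) / (2.0 * sigma2 ** 2))


# Candidate maximizer: mu1 = 0.75, mu2 = 2, sigma1^2 = 0.9375, sigma2^2 = 2
best = (0.75, 2.0, math.sqrt(0.9375), math.sqrt(2.0))
q_best = q_theta0(*best)

# Perturb one coordinate at a time; none of these should increase Q.
eps = 1e-3
neighbour_values = []
for idx in range(4):
    for sign in (-1.0, 1.0):
        point = list(best)
        point[idx] += sign * eps
        neighbour_values.append(q_theta0(*point))

is_local_max = all(q < q_best for q in neighbour_values)
print(q_best, is_local_max)
```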

Thus, after the first iteration, $\theta^{(0)}=[0,0,1,1]^{T}$ becomes $\theta^{(1)}=[0.75,\ 2,\ 0.938,\ 2]^{T}$.
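The whole first iteration can be reproduced in a few lines. The sketch below goes slightly beyond the text: it assumes that, for a diagonal covariance and any current estimate $\theta^{(i)}$, the missing component satisfies $x_{41}\,|\,x_{42}\sim\mathcal N(\mu_{1}^{(i)},(\sigma_{1}^{2})^{(i)})$, which reduces to the standard normal used above when $\theta^{(i)}=\theta^{(0)}$. The function name `em_step` and the tuple layout of `theta` are illustrative choices.

```python
def em_step(theta, complete, x42):
    """One EM update; theta = (mu1, mu2, var1, var2), var = variance."""
    mu1, mu2, var1, var2 = theta
    n = len(complete) + 1

    # E-step: moments of the missing x41 under the current estimate.
    e_x = mu1                  # E[x41]
    e_x2 = var1 + mu1 ** 2     # E[x41^2]

    # M-step: closed-form maximizers of Q for a diagonal Gaussian.
    new_mu1 = (sum(x[0] for x in complete) + e_x) / n
    new_mu2 = (sum(x[1] for x in complete) + x42) / n
    # E[(x41 - new_mu1)^2] expanded in terms of the two moments
    missing_sq = e_x2 - 2.0 * new_mu1 * e_x + new_mu1 ** 2
    new_var1 = (sum((x[0] - new_mu1) ** 2 for x in complete) + missing_sq) / n
    new_var2 = (sum((x[1] - new_mu2) ** 2 for x in complete)
                + (x42 - new_mu2) ** 2) / n
    return (new_mu1, new_mu2, new_var1, new_var2)


complete = [(0.0, 2.0), (1.0, 0.0), (2.0, 2.0)]
theta1 = em_step((0.0, 0.0, 1.0, 1.0), complete, x42=4.0)
print(theta1)

# Repeating the same update (not done in the text) to approach a fixed point
theta = theta1
for _ in range(100):
    theta = em_step(theta, complete, x42=4.0)
print(theta)
```

Under this assumption, a fixed point of the update must satisfy $\mu_{1}=(3+\mu_{1})/4$ and $\sigma_{1}^{2}=(2+\sigma_{1}^{2})/4$, that is, $\mu_{1}=1$ and $\sigma_{1}^{2}=2/3$, while $\mu_{2}$ and $\sigma_{2}^{2}$ stay at $2$ because the second feature is fully observed.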

Reposted from blog.csdn.net/xfijun/article/details/103545710