Deriving the Gradient of Logistic Regression

Copyright notice: This is an original article by the author and may not be reproduced without permission. https://blog.csdn.net/welcom_/article/details/84581844

While working on ex_2, I ran into the following gradient formula and derive it by hand here:

$\frac{\partial J(\theta)}{\partial\theta_j}=\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x^{(i)}_j$

  1. Preliminaries:
    Cost function: $J(\theta)=\frac{1}{m}\sum_{i=1}^m\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right)-(1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right]$
    Hypothesis: $h_\theta(x^{(i)})=g(x^{(i)}\theta)$
    Logistic function: $g(z)=\frac{1}{1+\exp(-z)}$, where $\exp(z)=e^z$
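These definitions can be sketched in NumPy (a minimal illustration; the names `sigmoid`, `cost`, `X`, `y`, and `theta` are mine, not from ex_2, and `X` is assumed to hold one example $x^{(i)}$ per row):

```python
import numpy as np

def sigmoid(z):
    # logistic function g(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = (1/m) * sum_i [ -y*log(h) - (1-y)*log(1-h) ]
    m = len(y)
    h = sigmoid(X @ theta)  # h_theta(x^{(i)}) = g(x^{(i)} theta), vectorized over rows
    return (-y @ np.log(h) - (1.0 - y) @ np.log(1.0 - h)) / m
```

For example, with `theta = 0` every prediction is $g(0)=0.5$, so the cost is $\log 2$ regardless of the labels.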
  2. Derivation (writing $h'_\theta(x^{(i)})$ for $\frac{\partial h_\theta(x^{(i)})}{\partial\theta_j}$):
    $\frac{\partial J(\theta)}{\partial\theta_j}=-\frac{1}{m}\sum_{i=1}^m\left(y^{(i)}\frac{h'_\theta(x^{(i)})}{h_\theta(x^{(i)})}+(1-y^{(i)})\frac{-h'_\theta(x^{(i)})}{1-h_\theta(x^{(i)})}\right)$
    $=-\frac{1}{m}\sum_{i=1}^m\frac{y^{(i)}h'_\theta(x^{(i)})(1-h_\theta(x^{(i)}))-(1-y^{(i)})\,h'_\theta(x^{(i)})\,h_\theta(x^{(i)})}{h_\theta(x^{(i)})(1-h_\theta(x^{(i)}))}$
    $=-\frac{1}{m}\sum_{i=1}^m\frac{\left(y^{(i)}-h_\theta(x^{(i)})\right)h'_\theta(x^{(i)})}{h_\theta(x^{(i)})(1-h_\theta(x^{(i)}))}\quad\ldots(1)$
    Let $H(x)=x^{(i)}\theta$. Then $h'_\theta(x^{(i)})=\frac{e^{-H(x)}}{(1+e^{-H(x)})^2}\,H'(x)\quad\ldots(2)$
    $h_\theta(x^{(i)})(1-h_\theta(x^{(i)}))=\frac{e^{-H(x)}}{(1+e^{-H(x)})^2}\quad\ldots(3)$
    $H'(x)=\frac{\partial H(x)}{\partial\theta_j}=x_j^{(i)}\quad\ldots(4)$
    Substituting (2), (3), and (4) into (1) gives $\frac{\partial J(\theta)}{\partial\theta_j}=\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x^{(i)}_j$
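The final formula can be sanity-checked numerically: the analytic gradient $\frac{1}{m}\sum_i(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}$ (in matrix form, $\frac{1}{m}X^T(h-y)$) should agree with a central-difference approximation of $J$. A minimal sketch, with a made-up tiny data set:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    m = len(y)
    h = sigmoid(X @ theta)
    return (-y @ np.log(h) - (1.0 - y) @ np.log(1.0 - h)) / m

def grad(theta, X, y):
    # analytic gradient from the derivation: (1/m) * X^T (h - y)
    m = len(y)
    h = sigmoid(X @ theta)
    return X.T @ (h - y) / m

# tiny synthetic data set (first column is the intercept feature)
X = np.array([[1., 0.5], [1., -1.2], [1., 2.3]])
y = np.array([1., 0., 1.])
theta = np.array([0.1, -0.4])

# central-difference approximation of each partial derivative
eps = 1e-6
num = np.array([
    (cost(theta + eps * e, X, y) - cost(theta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(len(theta))
])
print(np.max(np.abs(num - grad(theta, X, y))))  # maximum discrepancy; should be near zero
```

If the two disagree by more than roundoff-level noise, the derivation (or its implementation) has a bug; this is the same gradient-checking idea the Coursera exercises use.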
