The tech world has its own call for unification: deriving the gradients of basic machine learning functions (made easy)

Welcome to follow, and please give it a like!

The tech world has its own call for unification: the derivative expressions of linear regression and logistic regression turn out to be identical

Derivative of the linear regression function

Model:

$$h_\theta(x_i) = \theta_1 x_i^1 + \theta_2 x_i^2 + \cdots + \theta_j x_i^j = \theta^T x_i$$

Loss function:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2$$

Taking the derivative (chain rule for composite functions: the power rule contributes the factor 2, and the inner derivative contributes $x_i^j$):

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x_i) - y_i \right) x_i^j = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right) x_i^j$$

That is, the expectation of $(\text{predicted value} - \text{actual value}) \times x_i^j$.
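To sanity-check this result, here is a minimal sketch (not from the original post; the function names and the random test data are chosen purely for illustration) that compares the analytic gradient $\frac{1}{m} \sum_i (h_\theta(x_i) - y_i)\, x_i^j$ with a finite-difference approximation of $J(\theta)$:

```python
import numpy as np

def linreg_gradient(theta, X, y):
    """Analytic gradient of J(theta) = (1/2m) * sum_i (x_i^T theta - y_i)^2."""
    m = X.shape[0]
    residual = X @ theta - y          # h_theta(x_i) - y_i for every sample
    return X.T @ residual / m         # (1/m) * sum_i residual_i * x_i^j

def numerical_gradient(J, theta, eps=1e-6):
    """Central-difference approximation of dJ/dtheta_j, one component at a time."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (J(theta + step) - J(theta - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
theta = rng.normal(size=3)
J = lambda t: np.sum((X @ t - y) ** 2) / (2 * X.shape[0])

print(np.allclose(linreg_gradient(theta, X, y), numerical_gradient(J, theta)))  # True
```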

Derivative of the logistic function

Sigmoid function:

$$g(z) = \frac{1}{1 + e^{-z}}$$

Derivative of the sigmoid:

$$\frac{\partial g(z)}{\partial z} = -\frac{e^{-z} \times (-1)}{(1 + e^{-z})^2} = \frac{e^{-z}}{(1 + e^{-z})^2} = g(z) \times \left( 1 - g(z) \right)$$

Model:

$$h_\theta(x_i) = g(\theta^T x_i) = \frac{1}{1 + e^{-\theta^T x_i}}$$

Loss function:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left( y_i \log(h_\theta(x_i)) + (1 - y_i) \log(1 - h_\theta(x_i)) \right)$$

or, writing $z = \theta^T x_i$:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left( y_i \log(g(z)) + (1 - y_i) \log(1 - g(z)) \right)$$

where

$$\frac{\partial J(\theta)}{\partial g(z)} = -\frac{1}{m} \sum_{i=1}^{m} \left( \frac{y_i}{g(z)} + \frac{1 - y_i}{1 - g(z)} \times (-1) \right) = -\frac{1}{m} \sum_{i=1}^{m} \frac{y_i - g(z)}{g(z) \times (1 - g(z))}$$
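The identity $g'(z) = g(z)(1 - g(z))$ is the piece that gets reused in the chain rule below; a quick numerical check (an illustrative snippet, not part of the original post) confirms it:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # finite difference of g
analytic = sigmoid(z) * (1 - sigmoid(z))                     # g(z) * (1 - g(z))
print(np.allclose(numeric, analytic))  # True
```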

Chain-rule differentiation of the composite function
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{\partial J(\theta)}{\partial g(z)} \times \frac{\partial g(z)}{\partial z} \times \frac{\partial z}{\partial \theta_j} = \left[ -\frac{1}{m} \sum_{i=1}^{m} \frac{y_i - g(z)}{g(z) \times (1 - g(z))} \right] \times \left[ g(z) \times (1 - g(z)) \right] \times \left[ x_i^j \right]$$

$$= \frac{1}{m} \sum_{i=1}^{m} \left( g(z) - y_i \right) x_i^j = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right) x_i^j$$

(Here $z = \theta^T x_i$, so $\partial z / \partial \theta_j = x_i^j$.) Again, this is the expectation of $(\text{predicted value} - \text{actual value}) \times x_i^j$.
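As with linear regression, this result can be verified numerically. The sketch below is illustrative only (`logreg_loss` and `logreg_gradient` are made-up names, and the random data is arbitrary); it compares the chain-rule gradient against finite differences of the cross-entropy loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logreg_loss(theta, X, y):
    """Cross-entropy loss J(theta) from the derivation above."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def logreg_gradient(theta, X, y):
    """Chain-rule result: (1/m) * sum_i (h_theta(x_i) - y_i) * x_i^j."""
    m = X.shape[0]
    return X.T @ (sigmoid(X @ theta) - y) / m

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = (rng.random(50) < 0.5).astype(float)
theta = rng.normal(size=3)

eps = 1e-6
numeric = np.array([
    (logreg_loss(theta + eps * np.eye(3)[j], X, y)
     - logreg_loss(theta - eps * np.eye(3)[j], X, y)) / (2 * eps)
    for j in range(3)
])
print(np.allclose(numeric, logreg_gradient(theta, X, y)))  # True
```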

The sigmoid function handles the binary (two-class) logistic classification problem; the softmax function handles the multi-class case.
$$\text{softmax:}\quad h_\theta(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 \mid x^{(i)}; \theta) \\ p(y^{(i)} = 2 \mid x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k \mid x^{(i)}; \theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{k} e^{\theta_j^T x^{(i)}}} \begin{bmatrix} e^{\theta_1^T x^{(i)}} \\ e^{\theta_2^T x^{(i)}} \\ \vdots \\ e^{\theta_k^T x^{(i)}} \end{bmatrix}$$
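For completeness, here is a small illustrative implementation of the softmax hypothesis (the max-subtraction step is a standard numerical-stability trick added here, not something stated in the post):

```python
import numpy as np

def softmax_hypothesis(Theta, x):
    """Theta: (k, n) matrix whose rows are theta_1 ... theta_k; x: (n,) sample."""
    scores = Theta @ x
    scores = scores - scores.max()   # subtract the max score for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

Theta = np.array([[1.0, -1.0],
                  [0.5,  0.5],
                  [-1.0, 2.0]])      # k = 3 classes, n = 2 features
x = np.array([0.3, 0.7])
p = softmax_hypothesis(Theta, x)
print(p, p.sum())                    # probabilities over the 3 classes, summing to 1.0
```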

Conclusion:

In machine learning, the derivative of the loss function for both linear regression and logistic regression comes out to the same expression:

$$\frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right) x_i^j$$

the expectation of $(\text{predicted value} - \text{actual value}) \times x_i^j$.

The two are unified in form!
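One practical payoff of the unified form: a single gradient-descent update, $\theta \leftarrow \theta - \alpha \cdot \frac{1}{m} X^T (h(X\theta) - y)$, trains both models, and only the hypothesis $h$ changes. The sketch below is a hypothetical illustration (names, learning rate, and data are assumptions, not from the post):

```python
import numpy as np

def gradient_descent(X, y, h, alpha=0.1, steps=1000):
    """Shared update rule; h is the hypothesis applied to X @ theta."""
    theta = np.zeros(X.shape[1])
    m = X.shape[0]
    for _ in range(steps):
        theta -= alpha * X.T @ (h(X @ theta) - y) / m
    return theta

identity = lambda z: z                        # linear regression hypothesis
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))  # logistic regression hypothesis

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
theta_lin = gradient_descent(X, X @ np.array([1.0, -2.0]), identity)
theta_log = gradient_descent(X, (X @ np.array([1.0, -2.0]) > 0).astype(float), sigmoid)
print(theta_lin)  # recovers approximately [1, -2]
print(theta_log)  # points along the same direction, separating the two classes
```

Swapping `identity` for `sigmoid` is the only change between the two training runs; the update line itself never changes.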

Laying code bricks without forgetting to care for the country!
May the two sides of the strait achieve real unification soon!

Welcome to follow, and please give it a like!



Reposted from blog.csdn.net/weixin_45221012/article/details/105068663