Proof that the total sum of squares = regression sum of squares + residual sum of squares

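Linear regression has the following property:
$$\text{SST} = \text{SSR} + \text{SSE}$$
In words: total sum of squares (SST) = regression sum of squares (SSR) + residual sum of squares (SSE).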

That is:
$$\sum(y_i-\overline y)^2=\sum(\hat y_i-\overline y)^2+\sum(y_i-\hat y_i)^2\tag{1}$$

Proof: we prove the result for simple (one-variable) linear regression. Adding and subtracting $\hat y_i$ inside the square gives
$$\begin{aligned} \sum(y_i-\overline y)^2&=\sum(y_i-\hat y_i+\hat y_i-\overline y)^2\\ &=\sum(y_i-\hat y_i)^2+\sum(\hat y_i-\overline y)^2+2\sum(y_i-\hat y_i)(\hat y_i-\overline y) \end{aligned}$$

It therefore remains to show that the cross term vanishes: $\sum(y_i-\hat y_i)(\hat y_i-\overline y)=0$.

$$\sum(y_i-\hat y_i)(\hat y_i-\overline y)=\sum(y_i-\hat y_i)\hat y_i-\overline y\sum (y_i-\hat y_i)\tag{2}$$

By the method of least squares, if the regression equation is $y=\beta_0+\beta_1x$, the objective is to minimize $f=\sum (y_i-\beta_0-\beta_1x_i)^2$. The coefficients $\beta_0, \beta_1$ are found by setting the first-order partial derivatives of $f$ to zero:
$$\frac{\partial f}{\partial \beta_0}=-2\sum(y_i-\beta_0-\beta_1x_i)=0$$
Since $\hat y_i=\beta_0+\beta_1x_i$, it follows that
$$\sum (y_i-\hat y_i)=0\tag{3}$$

Likewise:
$$\frac{\partial f}{\partial \beta_1}=-2\sum x_i(y_i-\beta_0-\beta_1x_i)=0$$

so $\sum x_i(y_i-\hat y_i)=0$. Multiplying (3) by $\beta_0$ and this identity by $\beta_1$, then adding the two, gives
$$\sum (\beta_0+\beta_1x_i)(y_i-\beta_0-\beta_1x_i)=\sum\hat y_i(y_i-\hat y_i)=0\tag{4}$$
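The two first-order conditions can be checked numerically. The sketch below is only an illustration: the helper name `fit_line` and the sample data are made up, and the closed-form estimates it uses ($\beta_1=\sum(x_i-\overline x)(y_i-\overline y)/\sum(x_i-\overline x)^2$, $\beta_0=\overline y-\beta_1\overline x$) are the standard solution of the two equations above, which the original text does not write out.

```python
def fit_line(xs, ys):
    """Closed-form least-squares estimates for y = b0 + b1 * x (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up sample data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
y_hat = [b0 + b1 * x for x in xs]
resid = [y - yh for y, yh in zip(ys, y_hat)]

# Equation (3): the residuals sum to zero.
sum_resid = sum(resid)
# Condition from df/d(beta_1) = 0: residuals are orthogonal to x.
sum_x_resid = sum(x * e for x, e in zip(xs, resid))
# Equation (4): residuals are orthogonal to the fitted values.
sum_fit_resid = sum(yh * e for yh, e in zip(y_hat, resid))

# All three should be zero up to floating-point rounding.
print(sum_resid, sum_x_resid, sum_fit_resid)
```

The three printed values are not exactly zero in floating-point arithmetic, so any comparison should use a small tolerance rather than `== 0`.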

Combining (2), (3), and (4), the cross term is zero, so (1) holds. Therefore:
$$\text{SST} = \text{SSR} + \text{SSE}$$
$\Box$
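As a sanity check on the result, the following sketch computes the three sums of squares for a least-squares line and for a deliberately shifted line. The function names and the data are hypothetical, chosen only for illustration. It assumes the standard closed-form least-squares estimates, and it shows that the decomposition depends on the cross term being zero, which a non-least-squares line does not guarantee.

```python
def least_squares(xs, ys):
    """Closed-form least-squares estimates for y = b0 + b1 * x (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b1 * x_bar, b1


def sums_of_squares(xs, ys, b0, b1):
    """Return (SST, SSR, SSE, cross term) for the line y = b0 + b1 * x."""
    y_bar = sum(ys) / len(ys)
    y_hat = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))
    cross = sum((y - yh) * (yh - y_bar) for y, yh in zip(ys, y_hat))
    return sst, ssr, sse, cross


# Made-up sample data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares(xs, ys)

# Least-squares line: the cross term vanishes, so SST = SSR + SSE.
sst, ssr, sse, cross = sums_of_squares(xs, ys, b0, b1)
print(sst, ssr + sse, cross)

# Shifted intercept (no longer least squares): the cross term is nonzero
# and SST = SSR + SSE + 2 * cross instead.
sst_bad, ssr_bad, sse_bad, cross_bad = sums_of_squares(xs, ys, b0 + 0.5, b1)
print(sst_bad, ssr_bad + sse_bad, cross_bad)
```

For the shifted line the expansion $\text{SST}=\text{SSR}+\text{SSE}+2\sum(y_i-\hat y_i)(\hat y_i-\overline y)$ still holds as an algebraic identity. Only the least-squares conditions (3) and (4) make the last term disappear.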


Reposted from blog.csdn.net/robert_chen1988/article/details/98536420