Standard errors for multiple regression coefficients?


18

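I realize that this is a very basic question, but I can't find an answer anywhere.

I'm computing regression coefficients using either the normal equations or QR decomposition. How can I compute the standard error of each coefficient? I usually think of standard errors as being computed as:

$$SE_{\bar{x}} = \frac{\sigma_{\bar{x}}}{\sqrt{n}}$$

What is the $\sigma_{\bar{x}}$ for each coefficient? What is the most efficient way to compute it in the context of OLS?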

Answers:


19

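When doing least squares estimation (assuming a normal random component), the regression parameter estimates are normally distributed with mean equal to the true regression parameter and covariance matrix $\Sigma = s^2 (X^TX)^{-1}$, where $s^2$ is the residual variance and $X$ is the design matrix. $X^T$ is the transpose of $X$, and $X$ is defined by the model equation $Y = X\beta + \epsilon$, with $\beta$ the regression parameters and $\epsilon$ the error term. The estimated standard deviation of a beta parameter is obtained by multiplying the corresponding diagonal term of $(X^TX)^{-1}$ by the sample estimate of the residual variance and then taking the square root. This is not a very simple calculation, but any software package will compute it for you and provide it in the output.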
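Since the question mentions QR decomposition, here is a minimal sketch of that recipe in Python. It assumes NumPy is available, and the function name `ols_standard_errors` is only illustrative. It relies on the identity $(X^TX)^{-1} = R^{-1}R^{-T}$ for $X = QR$, so the diagonal of $(X^TX)^{-1}$ is the row-wise sum of squares of $R^{-1}$.

```python
import numpy as np


def ols_standard_errors(X, y):
    """Return (coefficients, standard errors) for the model y = X @ beta + error.

    X must already contain a column of ones if an intercept is wanted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Thin QR factorisation: X = Q R, with R upper triangular (p x p).
    Q, R = np.linalg.qr(X)
    beta = np.linalg.solve(R, Q.T @ y)

    # Sample estimate of the residual variance: s^2 = SSE / (n - p).
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)

    # (X^T X)^{-1} = R^{-1} R^{-T}, so its diagonal is the row-wise
    # sum of squares of R^{-1}.
    R_inv = np.linalg.inv(R)
    diag = np.sum(R_inv ** 2, axis=1)

    return beta, np.sqrt(s2 * diag)
```

With a design matrix whose first column is all ones, the first returned standard error belongs to the intercept. Forming only the diagonal avoids building the full covariance matrix when just the standard errors are needed.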

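On page 134 of Draper and Smith (referenced in my comment), they provide the following data for fitting by least squares the model $Y = \beta_0 + \beta_1 X + \varepsilon$, where $\varepsilon \sim N(0, I\sigma^2)$.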

                      X                      Y                    XY
                      0                     -2                     0
                      2                      0                     0
                      2                      2                     4
                      5                      1                     5
                      5                      3                    15
                      9                      1                     9
                      9                      0                     0
                      9                      0                     0
                      9                      1                     9
                     10                     -1                   -10
                    ---                     --                   ---
Sum                  60                      5                    32
Sum of  Squares     482                     21                   528

This looks like an example where the slope should be close to 0.

$$X^T = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 2 & 2 & 5 & 5 & 9 & 9 & 9 & 9 & 10 \end{pmatrix}.$$

So

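$$X^TX = \begin{pmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{pmatrix} = \begin{pmatrix} 10 & 60 \\ 60 & 482 \end{pmatrix}$$

$$(X^TX)^{-1} = \begin{pmatrix} \dfrac{\sum X_i^2}{n\sum (X_i-\bar{X})^2} & \dfrac{-\bar{X}}{\sum (X_i-\bar{X})^2} \\ \dfrac{-\bar{X}}{\sum (X_i-\bar{X})^2} & \dfrac{1}{\sum (X_i-\bar{X})^2} \end{pmatrix} = \begin{pmatrix} \dfrac{482}{10(122)} & -\dfrac{6}{122} \\ -\dfrac{6}{122} & \dfrac{1}{122} \end{pmatrix} = \begin{pmatrix} 0.395 & -0.049 \\ -0.049 & 0.008 \end{pmatrix}$$

$$\bar{X} = \sum X_i / n = 60/10 = 6$$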
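As a quick check of the matrix arithmetic, the following sketch rebuilds $X^TX$ and its inverse from the X column of the table using exact fractions. It uses only the Python standard library, and the variable names are illustrative.

```python
from fractions import Fraction

x = [0, 2, 2, 5, 5, 9, 9, 9, 9, 10]
n = len(x)
sum_x = sum(x)
sum_x2 = sum(v * v for v in x)

# X^T X for a design matrix with a column of ones and a column of x values.
xtx = [[n, sum_x], [sum_x, sum_x2]]

# The inverse of a 2x2 matrix [[a, b], [c, d]] is [[d, -b], [-c, a]] / (ad - bc).
det = n * sum_x2 - sum_x * sum_x
xtx_inv = [
    [Fraction(sum_x2, det), Fraction(-sum_x, det)],
    [Fraction(-sum_x, det), Fraction(n, det)],
]

for row in xtx_inv:
    print([round(float(v), 3) for v in row])
# prints [0.395, -0.049] then [-0.049, 0.008]
```

The determinant is $n\sum X_i^2 - (\sum X_i)^2 = 1220 = n\sum (X_i-\bar{X})^2$, which is why the entries can be written in terms of $\sum (X_i-\bar{X})^2 = 122$.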

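Estimate for $\beta$:

$$\hat{\beta} = (X^TX)^{-1}X^TY = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = \begin{pmatrix} \bar{Y} - b_1\bar{X} \\ S_{xy}/S_{xx} \end{pmatrix}$$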

$b_1 = S_{xy}/S_{xx} = 2/122 = 1/61 \approx 0.0164$ and $b_0 = 0.5 - 0.0164(6) \approx 0.402$

From $(X^TX)^{-1}$ above, $S_{b_1} = S_e\sqrt{0.008}$ and $S_{b_0} = S_e\sqrt{0.395}$, where $S_e$ is the estimated standard deviation of the error term, $S_e = \sqrt{2.3085}$.
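To put numbers on these standard errors, here is a small sketch that works through the same example with the closed-form expressions for simple linear regression. It uses only the Python standard library, the variable names are illustrative, and it takes the residual variance as $SSE/(n-2)$.

```python
import math

x = [0, 2, 2, 5, 5, 9, 9, 9, 9, 10]
y = [-2, 0, 2, 1, 3, 1, 0, 0, 1, -1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates of the slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual variance estimate with n - 2 degrees of freedom.
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)

# Standard errors: sqrt(s^2 times the matching diagonal entry of (X^T X)^{-1}).
se_b1 = math.sqrt(s2 / s_xx)
se_b0 = math.sqrt(s2 * sum(xi * xi for xi in x) / (n * s_xx))

print(f"b0={b0:.4f} b1={b1:.4f} s2={s2:.4f} se_b0={se_b0:.3f} se_b1={se_b1:.3f}")
```

The residual variance comes out at about 2.308, in line with the value quoted above, and the standard errors are roughly 0.955 for $b_0$ and 0.138 for $b_1$. The slope estimate is therefore far smaller than its standard error.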

Sorry that the equations didn't carry subscripting and superscripting when I cut and pasted them. The table didn't reproduce well either because the spaces got ignored. The first string of three numbers corresponds to the first values of X, Y, and XY, and the same holds for the following strings of three. After Sum come the sums for X, Y, and XY respectively, and then the sums of squares for X, Y, and XY respectively. The 2x2 matrices got messed up too. The values after the brackets should be in brackets underneath the numbers to the left.


2
Not meant as a plug for my book, but I go through the computations of the least squares solution in simple linear regression (Y=aX+b) and calculate the standard errors for a and b, pp. 101-103, The Essentials of Biostatistics for Physicians, Nurses, and Clinicians, Wiley 2011. A more detailed description can be found in Draper and Smith, Applied Regression Analysis, 3rd Edition, Wiley, New York, 1998, pages 126-127. In my answer that follows I will take an example from Draper and Smith.
Michael R. Chernick

8
When I started interacting with this site, Michael, I had similar feelings. With experience, they have changed. It's worthwhile knowing some TeX, and once you do, it's (almost) as fast to type it in as it is to type in anything in English. I also learned, by studying exemplary posts (such as many replies by @chl, cardinal, and other high-reputation-per-post users), that providing references, clear illustrations, and well-thought-out equations is usually highly appreciated and well received. High quality is one thing distinguishing this site from most others.
whuber

2
That is all nice, Bill, and it is nice that so many people are dedicated to giving those high quality posts. I may use LaTeX for other purposes, like publishing papers. But I don't have the time to go to all the effort that people expect of me on this site. I am not going to invest the time just to provide service on this site.
Michael R. Chernick

4
I think the disconnect is here: "This is just one of many things about this site that requires those posting to put in extra time and effort" - @whuber and I are both saying that it, in fact, does not take extra time if you know how to do it. We don't learn TeX so that we can post on this site - we (at least I) learn TeX because it's an important skill to have as a statistician and happens to make posts much more readable on this site.
Macro

3
Like many of the people on here, yes, I work as a statistician, but I also happen to find it fun - this site is recreational for me and it's a nice bonus that others find some of my posts useful. If you find marking up your equations with TeX to be work and don't think it's worth learning then so be it, but know that some of your content will be overlooked.
Macro
Licensed under cc by-sa 3.0 with attribution required.