@whuber has pointed you to three good answers, but perhaps I can still write something of value. Your explicit question, as I understand it, is:
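Given my fitted model, $\hat y_i = \hat m x_i + \hat b$ (notice that I added 'hats'), and assuming my residuals are normally distributed, $\mathcal N(0, \hat\sigma^2_e)$, can I predict that an as yet unobserved response, $y_{\text{new}}$, with a known predictor value, $x_{\text{new}}$, will fall within the interval $(\hat y - \sigma_e,\ \hat y + \sigma_e)$ with probability 68%?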
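Intuitively, the answer seems like it should be 'yes', but the true answer is 'maybe'. It will be 'yes' when the parameters (i.e., $m$, $b$, & $\sigma$) are known and without error. Since you estimated these parameters, we need to take their uncertainty into account.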
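Let's first think about the standard deviation of your residuals. Because it is estimated from your data, the estimate carries some error. As a result, the distribution you should use to form your prediction interval is $t_{\text{df error}}$, not the normal. However, since the $t$ distribution converges rapidly to the normal as the degrees of freedom grow, this is unlikely to be a problem in practice.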
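To get a feel for how much estimating the residual standard deviation matters, here is a small simulation sketch using only the Python standard library. It rests on a simplifying assumption that is not part of the question: the residuals are $\mathcal N(0, 1)$ with a known mean of zero, so that only $\sigma$ is estimated. The function name `coverage_of_one_sd` and its parameters are made up for illustration.

```python
import random
from math import sqrt

def coverage_of_one_sd(df, trials=20000, seed=0):
    """Estimate P(|e_new| <= s) when s is estimated from `df` residuals.

    Assumption for this sketch: residuals are N(0, 1) with a known mean of
    zero, so s^2 = sum(e^2) / df has exactly `df` degrees of freedom and
    e_new / s follows a t distribution with `df` degrees of freedom.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Estimate the residual SD from a fresh sample of `df` residuals.
        s = sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) / df)
        # Draw one new, independent residual and check the +/- 1 s interval.
        e_new = rng.gauss(0.0, 1.0)
        if abs(e_new) <= s:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for df in (3, 10, 50):
        print(df, round(coverage_of_one_sd(df), 3))
```

With very few degrees of freedom, the plus-or-minus one estimated standard deviation interval should cover noticeably less than 68% of new residuals. With a few dozen, the shortfall should be hard to see, which is the sense in which the $t$ correction rarely matters in practice.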
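So, can we just use $\hat y_{\text{new}} \pm t_{(1-\alpha/2,\ \text{df error})}\, s$ instead of $\hat y_{\text{new}} \pm z_{(1-\alpha/2)}\, s$, and go on our merry way? Unfortunately, no. The bigger problem is that there is uncertainty about your estimate of the conditional mean of the response at that location, due to the uncertainty in your estimates $\hat m$ & $\hat b$. Thus, the standard deviation of your predictions needs to incorporate more than just $s_{\text{error}}$. Because variances add, the estimated variance of the predictions will be: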
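$$s^2_{\text{predictions(new)}} = s^2_{\text{error}} + \text{Var}(\hat m x_{\text{new}} + \hat b)$$
Notice that the "$x$" is subscripted to denote the specific value for the new observation, and that the "$s^2$" is correspondingly subscripted. That is, your prediction interval depends on where the new observation falls along the $x$ axis. The standard deviation of your predictions can be estimated more conveniently with the following formula:
$$s_{\text{predictions(new)}} = \sqrt{s^2_{\text{error}}\left(1 + \frac{1}{N} + \frac{(x_{\text{new}} - \bar x)^2}{\sum (x_i - \bar x)^2}\right)}$$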
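As a rough illustration, the standard deviation of a prediction, $s_{\text{error}}\sqrt{1 + 1/N + (x_{\text{new}} - \bar x)^2 / \sum (x_i - \bar x)^2}$, can be computed in a few lines of plain Python. This is only a sketch: the helper names (`fit_line`, `prediction_sd`, `prediction_interval`) and the data are invented for the example, and the $t$ critical value is passed in by hand rather than computed (3.182 is the tabulated 97.5th percentile for 3 degrees of freedom, giving a 95% interval with $N = 5$).

```python
from math import sqrt
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = m*x + b; returns (m_hat, b_hat, s_error)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m_hat = sxy / sxx
    b_hat = y_bar - m_hat * x_bar
    sse = sum((yi - (m_hat * xi + b_hat)) ** 2 for xi, yi in zip(x, y))
    s_error = sqrt(sse / (n - 2))  # residual SD on N - 2 degrees of freedom
    return m_hat, b_hat, s_error

def prediction_sd(x, s_error, x_new):
    """Standard deviation of a prediction for a new observation at x_new."""
    n = len(x)
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return s_error * sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)

def prediction_interval(x, y, x_new, t_crit):
    """Interval y_hat +/- t_crit * s_predictions(new); t_crit is supplied by the caller."""
    m_hat, b_hat, s_error = fit_line(x, y)
    y_hat = m_hat * x_new + b_hat
    half_width = t_crit * prediction_sd(x, s_error, x_new)
    return y_hat - half_width, y_hat + half_width

if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    x = [1, 2, 3, 4, 5]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    t_crit = 3.182  # 97.5th percentile of t with N - 2 = 3 degrees of freedom
    for x_new in (3, 8):
        print(x_new, prediction_interval(x, y, x_new, t_crit))
```

The interval at `x_new = 3` (the mean of `x`) should come out narrower than the one at `x_new = 8`, since the last term under the square root vanishes at the mean.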
As an interesting side note, we can infer a few facts about prediction intervals from this equation. First, prediction intervals will be narrower the more data we had when we built the prediction model, because there is less uncertainty in $\hat m$ & $\hat b$. Second, predictions will be most precise when they are made at the mean of the $x$ values you used to develop your model, as the numerator of the third term will be $0$. The reason is that the fitted line always passes through $(\bar x, \bar y)$, so at the mean of $x$ any error in the estimated slope does not move the prediction; what remains is only some uncertainty about the true vertical position of the regression line. Thus, some lessons for building prediction models are: that more data is helpful, not for finding 'significance', but for improving the precision of future predictions; and that you should center your data collection efforts on the interval where you will need to make predictions in the future (to minimize that numerator), but spread the observations as widely from that center as you can (to maximize that denominator).
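These design lessons can be checked numerically. The sketch below compares the factor that multiplies $s_{\text{error}}$, namely $\sqrt{1 + 1/N + (x_{\text{new}} - \bar x)^2 / \sum (x_i - \bar x)^2}$, for three made-up sets of $x$ values that share the same mean. The function name `width_multiplier` and the three designs are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean

def width_multiplier(x, x_new):
    """Factor that multiplies s_error in the prediction SD at x_new."""
    n = len(x)
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)

# Three hypothetical designs, all centered at x = 5.
clustered = [4, 4.5, 5, 5.5, 6]   # observations bunched near the center
spread = [0, 2.5, 5, 7.5, 10]     # same N, spread widely around the center
more_data = spread * 4            # same spread, four times as many points

for name, design in (("clustered", clustered),
                     ("spread", spread),
                     ("more_data", more_data)):
    # Multiplier at the mean (x_new = 5) and away from it (x_new = 7).
    print(name, [round(width_multiplier(design, x_new), 3) for x_new in (5, 7)])
```

At the mean, the multiplier depends only on $N$. Away from the mean, the widely spread design should give a smaller multiplier than the clustered one, and adding more data should shrink it further.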
Having calculated the correct standard deviation in this manner, we can then use it with the appropriate $t$ distribution, as noted above.