Answers:
What is the expected distribution of residuals?
It depends on the model, so there is no general answer.
For example, should the residuals be distributed normally?
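In general, no.
There is a whole cottage industry devoted to designing residuals for GLMs that are more symmetric or even approximately "normal" (i.e. Gaussian), for example Pearson residuals, Anscombe residuals, (adjusted) deviance residuals, etc. See, for example, chapter 6 of James W. Hardin and Joseph M. Hilbe (2007) "Generalized Linear Models and Extensions", second edition. College Station, TX: Stata Press. If the dependent variable is discrete (an indicator variable or a count), then it is obviously very hard to make the expected distribution of the residuals exactly Gaussian.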
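To make the residual types named above concrete, here is a sketch of their usual textbook forms for the Poisson family, which is the family used in the Stata example further down. The notation is my own assumption and not taken from the reference: $y_i$ is the observed outcome and $\hat\mu_i$ the fitted mean. Software implementations may differ in details such as scaling or adjustment terms.

```latex
% Pearson residual: raw residual scaled by the square root of the
% Poisson variance function V(\mu) = \mu
r^{P}_i = \frac{y_i - \hat\mu_i}{\sqrt{\hat\mu_i}}

% Deviance residual: signed square root of observation i's contribution
% to the deviance (with the convention y_i \ln(y_i/\hat\mu_i) = 0 when y_i = 0)
r^{D}_i = \operatorname{sign}(y_i - \hat\mu_i)
          \sqrt{2\left[\, y_i \ln\!\left(\frac{y_i}{\hat\mu_i}\right)
          - (y_i - \hat\mu_i) \right]}

% Anscombe residual: the power 2/3 is chosen to make the distribution
% of the residual closer to Gaussian
r^{A}_i = \frac{3\left( y_i^{2/3} - \hat\mu_i^{2/3} \right)}{2\,\hat\mu_i^{1/6}}
```

All three are zero when $y_i = \hat\mu_i$. They differ in how they rescale or transform the discrepancy, which is why their distributions differ in symmetry.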
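One thing you can do is to repeatedly simulate new data under the assumption that your model is true, estimate the model on each simulated dataset and compute the residuals, and then compare your actual residuals with the simulated residuals. In Stata I would do it like this: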
sysuse nlsw88, clear
glm wage i.union grade c.ttl_exp##c.ttl_exp, link(log) family(poisson)
// collect which observations were used in estimation and the predicted mean
gen byte touse = e(sample)
predict double mu if touse
// predict residuals
predict resid if touse, anscombe
// prepare variables for plotting a cumulative distribution function
cumul resid, gen(c)
// collect the graph command in the local macro `graph'
local graph "twoway"
// create 19 simulations:
gen ysim = .
forvalues i = 1/19 {
    replace ysim = rpoisson(mu) if touse
    glm ysim i.union grade c.ttl_exp##c.ttl_exp, link(log) family(poisson)
    predict resid`i' if touse, anscombe
    cumul resid`i', gen(c`i')
    local graph "`graph' line c`i' resid`i', sort lpattern(solid) lcolor(gs8) ||"
}
local graph "`graph' line c resid, sort lpattern(solid) lcolor(black) "
// display the graph
`graph' legend(order(20 "actual residuals" 1 "simulations"))