假设我有一个具有显着相关性的双变量响应。我正在尝试比较两种模拟这些结果的方法。一种方法是对两个结果之间的差异进行建模: 另一种方法是对它们进行使用或建模:
gls
gee
这是一个foo示例:
#create foo data frame
require(mvtnorm)
require(reshape)
set.seed(123456)
sigma <- matrix(c(4,2,2,3), ncol=2)
y <- rmvnorm(n=500, mean=c(1,2), sigma=sigma)
cor(y)
x1<-rnorm(500)
x2<-rbinom(500,1,0.4)
df.wide<-data.frame(id=seq(1,500,1),y1=y[,1],y2=y[,2],x1,x2)
df.long<-reshape(df.wide,idvar="id",varying=list(2:3),v.names="y",direction="long")
df.long<-df.long[order(df.long$id),]
df.wide$diff_y<-df.wide$y2-df.wide$y1
#regressions
fit1<-lm(diff_y~x1+x2,data=df.wide)
fit2<-lm(y~time+x1+x2,data=df.long)
fit3<-gls(y~time+x1+x2,data=df.long, correlation = corAR1(form = ~ 1 | time))
fit1
和之间有什么根本区别fit2
?在fit2
和之间fit3
,假设它们与值和估计值如此接近?
Holland, Paul & Donald Rubin. 1983. On Lord’s Paradox. In Principles of modern psychological measurement: A festchrift for Frederic M. Lord edited by Wainer, Howard & Samuel Messick pgs:3-25. Lawrence Erlbaum Associates. Hillsdale, NJ.