How is an ANOVA calculated for a repeated measures design: aov() vs lm() in R



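The title says it all, and I'm confused. The code below runs a repeated measures aov() in R, followed by what I thought was an equivalent lm() call, but the two return different residuals (although the sums of squares are the same).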

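Clearly the residuals and fitted values from aov() are the ones actually used in the model, because their sums of squares add up to each of the model/residual sums of squares reported in summary(my.aov). So what are the actual linear models that are applied to a repeated measures design?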

set.seed(1)
# make data frame,
# 5 participants, with 2 experimental factors, each with 2 levels
# factor1 is A, B
# factor2 is 1, 2
DF <- data.frame(participant=factor(1:5), A.1=rnorm(5, 50, 20), A.2=rnorm(5, 100, 20), B.1=rnorm(5, 20, 20), B.2=rnorm(5, 50, 20))

# get our experimental conditions
conditions <- names(DF)[ names(DF) != "participant" ]

# reshape it for aov
DFlong <- reshape(DF, direction="long", varying=conditions, v.names="value", idvar="participant", times=conditions, timevar="group")

# make the conditions separate variables called factor1 and factor2
DFlong$factor1 <- factor( rep(c("A", "B"), each=10) )
# rows are ordered A.1, A.2, B.1, B.2 (5 each), so the 1/2 pattern repeats twice
DFlong$factor2 <- factor( rep(rep(c(1, 2), each=5), times=2) )

# call aov
my.aov <- aov(value ~ factor1*factor2 + Error(participant / (factor1*factor2)), DFlong)

# similar for an lm() call
fit <- lm(value ~ factor1*factor2 + participant, DFlong)

# what's aov telling us?
summary(my.aov)

# check SS residuals
sum(residuals(fit)^2)       # == 5945.668

# check they add up to the residuals from summary(my.aov)
2406.1 + 1744.1 + 1795.46   # == 5945.66

# all good so far, but how are the residuals in the aov calculated?
my.aov$"participant:factor1"$residuals

#clearly these are the ones used in the ANOVA:
sum(my.aov$"participant:factor1"$residuals ^ 2)

# this corresponds to the factor1 residuals here:
summary(my.aov)


# but they are different to the residuals reported from lm()
residuals(fit)
my.aov$"participant"$residuals
my.aov$"participant:factor1"$residuals
my.aov$"participant:factor1:factor2"$residuals

I'm not sure if this is what you mean, but you get all the SS when you also fit the interactions with participant, e.g. anova(lm(value ~ factor1*factor2*participant, DFlong))
caracal

Hmm, that's helpful. OK, so for the model lm(value ~ factor1*factor2*participant, DFlong), how are the sums of squares actually calculated? I.e., what is anova() doing?
trev 2011

Answers:



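One way to think about it is to treat the situation as a 3-factorial between-subjects ANOVA with IVs participant, factor1 and factor2, and a cell size of 1. anova(lm(value ~ factor1*factor2*participant, DFlong)) computes the SS for all effects in this 3-way ANOVA (3 main effects, 3 first-order interactions, 1 second-order interaction). Since there is only 1 person in each cell, the full model has no error left, and the above call to anova() cannot compute F-tests. But the SS are the same as in the 2-factorial within-subjects design.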
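As a quick check of the "no error left" point, here is a minimal sketch. It rebuilds the question's simulated data directly in long format instead of going through reshape(); that shortcut is my assumption (same seed, same order of rnorm() calls, so the values should match DFlong), not the original code.

```r
# Rebuild the simulated data directly in long format (assumption: same seed
# and rnorm() call order as in the question, so the values match DFlong)
set.seed(1)
value <- c(rnorm(5, 50, 20), rnorm(5, 100, 20), rnorm(5, 20, 20), rnorm(5, 50, 20))
DFlong <- data.frame(
  participant = factor(rep(1:5, times = 4)),
  factor1     = factor(rep(c("A", "B"), each = 10)),
  factor2     = factor(rep(rep(c(1, 2), each = 5), times = 2)),
  value       = value
)

# Saturated model: 20 observations, 2 x 2 x 5 = 20 cells, one value per cell
full <- lm(value ~ factor1 * factor2 * participant, data = DFlong)

df.residual(full)        # residual degrees of freedom of the saturated model
sum(residuals(full)^2)   # residual SS, zero up to floating-point error

# The Sum Sq column is still meaningful; R warns that F tests on a perfect
# fit are unreliable, hence suppressWarnings()
suppressWarnings(anova(full))
```

With zero residual degrees of freedom there is no error mean square to divide by, which is why the F column of this table carries no information.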

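How does anova() actually compute the SS for an effect? Through sequential model comparisons (type I): it fits a restricted model without the effect in question, and an unrestricted model that includes that effect. The SS associated with this effect is the difference in error SS between the two models.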
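In symbols, purely as a restatement of the paragraph above (the superscripts r and u for the restricted and unrestricted model are my own notation):

```latex
SS_{\text{effect}} = SS_{\text{error}}^{(r)} - SS_{\text{error}}^{(u)},
\qquad
df_{\text{effect}} = df_{\text{error}}^{(r)} - df_{\text{error}}^{(u)}
```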

# get all SS from the 3-way between subjects ANOVA
anova(lm(value ~ factor1*factor2*participant, DFlong))

dfL <- DFlong   # just a shorter name for your data frame
names(dfL) <- c("id", "group", "DV", "IV1", "IV2")   # shorter variable names

# sequential model comparisons (type I SS), restricted model is first, then unrestricted
# main effects first
anova(lm(DV ~ 1,      dfL), lm(DV ~ id,         dfL))  # SS for factor id
anova(lm(DV ~ id,     dfL), lm(DV ~ id+IV1,     dfL))  # SS for factor IV1
anova(lm(DV ~ id+IV1, dfL), lm(DV ~ id+IV1+IV2, dfL))  # SS for factor IV2

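# now first order interactions (the design is balanced, so each one can be
# added to the main-effects model on its own and still give the same SS)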
anova(lm(DV ~ id+IV1+IV2, dfL), lm(DV ~ id+IV1+IV2+id:IV1,  dfL))  # SS for id:IV1
anova(lm(DV ~ id+IV1+IV2, dfL), lm(DV ~ id+IV1+IV2+id:IV2,  dfL))  # SS for id:IV2
anova(lm(DV ~ id+IV1+IV2, dfL), lm(DV ~ id+IV1+IV2+IV1:IV2, dfL))  # SS for IV1:IV2

# finally the second-order interaction id:IV1:IV2
anova(lm(DV ~ id+IV1+IV2+id:IV1+id:IV2+IV1:IV2,            dfL),
      lm(DV ~ id+IV1+IV2+id:IV1+id:IV2+IV1:IV2+id:IV1:IV2, dfL))

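Now check the effect SS associated with the interaction id:IV1 by subtracting the error SS of the unrestricted model from the error SS of the restricted model.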

sum(residuals(lm(DV ~ id+IV1+IV2,        dfL))^2) -
sum(residuals(lm(DV ~ id+IV1+IV2+id:IV1, dfL))^2)
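To see that this difference really is the number anova() reports, the following sketch compares it with the two-model comparison and with the id:IV1 row of the full 3-way table. The data frame is rebuilt directly in long format with the short variable names (my shortcut, assuming the same seed and rnorm() call order as in the question); ss_by_hand, ss_compare and ss_table are just illustrative names.

```r
# Rebuild the data with the short variable names used above (assumption:
# same seed and rnorm() call order as in the question)
set.seed(1)
DV <- c(rnorm(5, 50, 20), rnorm(5, 100, 20), rnorm(5, 20, 20), rnorm(5, 50, 20))
dfL <- data.frame(
  id  = factor(rep(1:5, times = 4)),
  IV1 = factor(rep(c("A", "B"), each = 10)),
  IV2 = factor(rep(rep(c(1, 2), each = 5), times = 2)),
  DV  = DV
)

restricted   <- lm(DV ~ id + IV1 + IV2,          data = dfL)
unrestricted <- lm(DV ~ id + IV1 + IV2 + id:IV1, data = dfL)

# Difference in error SS, computed by hand
ss_by_hand <- sum(residuals(restricted)^2) - sum(residuals(unrestricted)^2)

# Same quantity from the two-model comparison (row 2 holds the difference)
ss_compare <- anova(restricted, unrestricted)[2, "Sum of Sq"]

# Same quantity from the sequential table of the full 3-way model
full_table <- suppressWarnings(anova(lm(DV ~ id * IV1 * IV2, data = dfL)))
ss_table   <- full_table["id:IV1", "Sum Sq"]

all.equal(ss_by_hand, ss_compare)
all.equal(ss_by_hand, ss_table)
```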

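Now that you have all the "raw" effect SS, you can build the within-subjects tests simply by choosing the correct error term to test each effect SS against. E.g., test the effect SS for factor1 against the interaction SS of participant:factor1.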
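As a sketch of that last step for factor1: take both SS from the 3-way table, form the F ratio by hand, and compare it with the participant:factor1 stratum of the aov() fit from the question. The data are again rebuilt directly in long format (my shortcut, assuming the same seed and rnorm() call order as in the question), and pulling the first row out of the summary() object by position is my assumption about its layout.

```r
# Rebuild the simulated data directly in long format (assumption: same seed
# and rnorm() call order as in the question)
set.seed(1)
value <- c(rnorm(5, 50, 20), rnorm(5, 100, 20), rnorm(5, 20, 20), rnorm(5, 50, 20))
DFlong <- data.frame(
  participant = factor(rep(1:5, times = 4)),
  factor1     = factor(rep(c("A", "B"), each = 10)),
  factor2     = factor(rep(rep(c(1, 2), each = 5), times = 2)),
  value       = value
)

# "Raw" SS from the 3-way between-subjects table
tab <- suppressWarnings(anova(lm(value ~ participant * factor1 * factor2, data = DFlong)))
ss_effect <- tab["factor1", "Sum Sq"]
df_effect <- tab["factor1", "Df"]
ss_error  <- tab["participant:factor1", "Sum Sq"]   # error term for factor1
df_error  <- tab["participant:factor1", "Df"]

# Within-subjects F test built by hand
F_manual <- (ss_effect / df_effect) / (ss_error / df_error)
p_manual <- pf(F_manual, df_effect, df_error, lower.tail = FALSE)

# The repeated measures aov() from the question, for comparison
my.aov  <- aov(value ~ factor1 * factor2 + Error(participant / (factor1 * factor2)), data = DFlong)
aov_row <- summary(my.aov)[["Error: participant:factor1"]][[1]][1, ]  # factor1 row

all.equal(F_manual, aov_row[["F value"]])
all.equal(p_manual, aov_row[["Pr(>F)"]])
```

The same pattern should carry over to factor2 (error term participant:factor2) and to factor1:factor2 (error term participant:factor1:factor2).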

For an excellent introduction to the model comparison approach, I recommend Maxwell & Delaney (2004), Designing Experiments and Analyzing Data.


Great answer, this really helped me finally understand what ANOVA does! Thanks for the book reference too!
trev 2011
Licensed under cc by-sa 3.0 with attribution required.