Answers:
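No to ANOVA, which assumes (among other things) a normally distributed outcome variable. There are "old school" transformations to consider, but I prefer logistic regression (equivalent to a chi-square test when there is only one independent variable, as in your case). The advantage of logistic regression over a chi-square test is that, if the overall (type 3) test is significant, you can easily use linear contrasts to compare specific levels of the treatment: for example, A versus B, B versus C, and so on.
Update added for clarity:
Taking the data at hand (the post doc data set from Allison) and using the variable cits as follows, this is my point: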
postdocData$citsBin <- ifelse(postdocData$cits>2, 3, postdocData$cits)
postdocData$citsBin <- as.factor(postdocData$citsBin)
postdocData$citsBin <- ordered(postdocData$citsBin, levels=c("0", "1", "2", "3")) # store the ordered factor
contrasts(postdocData$citsBin) <- contr.treatment(4, base=4) # set 4th level as reference
contrasts(postdocData$citsBin)
# 1 2 3
# 0 1 0 0
# 1 0 1 0
# 2 0 0 1
# 3 0 0 0
# fit the univariate logistic regression model
model.1 <- glm(pdoc~citsBin, data=postdocData, family=binomial(link="logit"))
library(car) # John Fox package
car::Anova(model.1, test="LR", type="III") # type 3 analysis (SAS verbiage)
# Response: pdoc
# LR Chisq Df Pr(>Chisq)
# citsBin 1.7977 3 0.6154
chisq.test(table(postdocData$citsBin, postdocData$pdoc))
# X-squared = 1.7957, df = 3, p-value = 0.6159
# then can test differences in levels, such as: contrast cits=0 minus cits=1 = 0
# Ho: Beta_1 - Beta_2 = 0
cVec <- c(0,1,-1,0)
car::linearHypothesis(model.1, cVec, verbose=TRUE)
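The same kind of contrast can be sketched on simulated data without the car package. This is a minimal illustration, not the original analysis: the data frame `sim`, its probabilities, and the helper `wald_contrast` are all made up for this example. It computes the Wald chi-square for a linear contrast directly from `coef()` and `vcov()`, which is, as far as I know, the statistic `car::linearHypothesis` reports by default for a `glm`.

```r
set.seed(1)
# Hypothetical data: four-level factor, binary outcome (all values are made up)
n_per <- 200
sim <- data.frame(
  grp = factor(rep(c("0", "1", "2", "3"), each = n_per)),
  y   = rbinom(4 * n_per, 1, rep(c(0.30, 0.35, 0.50, 0.40), each = n_per))
)
# Use the last level as the reference, as in the code above
contrasts(sim$grp) <- contr.treatment(4, base = 4)
fit <- glm(y ~ grp, data = sim, family = binomial(link = "logit"))

# Hypothetical helper: Wald test of H0: sum(cvec * beta) = 0
wald_contrast <- function(model, cvec) {
  est  <- sum(cvec * coef(model))
  se   <- sqrt(drop(t(cvec) %*% vcov(model) %*% cvec))
  chi2 <- (est / se)^2
  c(estimate = est, se = se, chisq = chi2,
    p = pchisq(chi2, df = 1, lower.tail = FALSE))
}

# Coefficients are (Intercept), grp1, grp2, grp3;
# this vector compares level "0" with level "1"
res <- wald_contrast(fit, c(0, 1, -1, 0))
print(res)
```

Because the model is saturated, the contrast estimate is simply the difference between the observed log odds of the two levels being compared.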
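Some modern authors, however, are quite skeptical of the arcsine transformation; see, for example, http://www.mun.ca/biology/dschneider/b7932/B7932Final10Dec2010.pdf. Those authors are concerned with issues such as prediction, where they show that the arcsine can cause problems. If you only care about hypothesis testing, it should be fine. A more modern approach would be to use logistic regression.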
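For reference, the transformation in question is the arcsine square root, asin(sqrt(p)), applied to a proportion p. A small simulation shows what it is meant to achieve: the variance of the raw proportion depends on p, while the variance of the transformed value stays close to 1/(4n). This is only a sketch; the sample size and probabilities are assumptions chosen for illustration and are not taken from the linked paper.

```r
set.seed(42)
n     <- 100                 # trials per proportion (assumed for illustration)
probs <- c(0.2, 0.5, 0.8)    # true proportions (assumed for illustration)
reps  <- 20000               # number of simulated samples per proportion

# Simulate sample proportions for each true probability (one column per p)
sims <- sapply(probs, function(p) rbinom(reps, n, p) / n)

var_raw   <- apply(sims, 2, var)                             # roughly p(1-p)/n
var_trans <- apply(sims, 2, function(x) var(asin(sqrt(x))))  # roughly 1/(4n)

print(rbind(p = probs, var_raw = var_raw, var_trans = var_trans,
            target = 1 / (4 * n)))
```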
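I would like to differ from your view of the chi-square test. It is applicable even if the data are not binomial. It is based on the asymptotic normality of the MLE (in most cases).
I would do a logistic regression like this:
where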
The first test is the ANOVA equivalent: it tests whether there is any relation at all.
The second tests whether A has some effect.
The third tests whether B has some effect.
The fourth tests whether C has some effect.
Now you can do further contrasts to find out what you are interested in. Each is still a chi-square test, but with different degrees of freedom (3, 1, 1, and 1, respectively).
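One way this workflow might look in R is sketched below on simulated data. Everything here is an assumption for illustration: the data frame `d`, the response rates, and in particular the treatment (dummy) coding with A as the reference level, which may differ from the coding this answer had in mind. Under this coding the overall test has 2 degrees of freedom (three groups, two dummies), so it will not reproduce the degrees of freedom quoted above.

```r
set.seed(7)
# Hypothetical data: three treatments A, B, C with a binary response (made-up rates)
n_per <- 150
d <- data.frame(
  trt = factor(rep(c("A", "B", "C"), each = n_per)),
  y   = rbinom(3 * n_per, 1, rep(c(0.30, 0.45, 0.60), each = n_per))
)

fit0 <- glm(y ~ 1,   data = d, family = binomial)  # no treatment effect
fit1 <- glm(y ~ trt, data = d, family = binomial)  # dummy variables for B and C

# Overall likelihood-ratio test: chi-square with df = number of dummies dropped
overall <- anova(fit0, fit1, test = "Chisq")
print(overall)

# One-degree-of-freedom Wald tests for each coefficient
print(coef(summary(fit1)))
```

The overall test plays the role of the ANOVA F test, and the per-coefficient rows are the single-degree-of-freedom follow-ups.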
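I think you are right that ANOVA should not be used to analyze a binomial dependent variable. Many people use it to compare the means of a binary response variable (0/1), but it should not be used, because this seriously violates the normality and equal-variance assumptions. A chi-square test or logistic regression is the best fit for these situations.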
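The equal-variance problem is easy to see numerically: a 0/1 variable with success probability p has variance p(1 - p), so groups with different means necessarily have different variances. The sketch below uses made-up probabilities and group sizes purely for illustration.

```r
set.seed(3)
# Hypothetical groups with different success probabilities (made-up values)
probs <- c(A = 0.1, B = 0.5, C = 0.9)
n_per <- 500
d <- data.frame(
  grp = factor(rep(names(probs), each = n_per)),
  y   = rbinom(3 * n_per, 1, rep(probs, each = n_per))
)

# Theoretical variance of a 0/1 variable is p * (1 - p)
theo_var <- probs * (1 - probs)
# Sample variance of the response within each group
samp_var <- tapply(d$y, d$grp, var)

print(rbind(theoretical = theo_var, sample = samp_var))
```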