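To analyze zero-inflated bird counts, I would like to apply zero-inflated count models using the R package pscl. However, after looking at the example provided in the documentation for one of the main functions (?zeroinfl), I began to wonder what the real advantage of these models is. Following the sample code given there, I fitted standard Poisson, quasi-Poisson and negative binomial models, simple zero-inflated Poisson and negative binomial models, and zero-inflated Poisson and negative binomial models with regressors for the zero component. I then inspected histograms of the observed and the fitted data. (Here is the code to replicate this.)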
library(pscl)
library(MASS)  # provides glm.nb(); recent pscl versions import MASS but do not attach it
data("bioChemists", package = "pscl")
## standard count data models
fm_pois <- glm(art ~ ., data = bioChemists, family = poisson)
fm_qpois <- glm(art ~ ., data = bioChemists, family = quasipoisson)
fm_nb <- glm.nb(art ~ ., data = bioChemists)
## with simple inflation (no regressors for zero component)
fm_zip <- zeroinfl(art ~ . | 1, data = bioChemists)
fm_zinb <- zeroinfl(art ~ . | 1, data = bioChemists, dist = "negbin")
## inflation with regressors
fm_zip2 <- zeroinfl(art ~ fem + mar + kid5 + phd + ment |
                      fem + mar + kid5 + phd + ment,
                    data = bioChemists)
fm_zinb2 <- zeroinfl(art ~ fem + mar + kid5 + phd + ment |
                       fem + mar + kid5 + phd + ment,
                     data = bioChemists, dist = "negbin")
## histograms
breaks <- seq(-0.5,20.5,1)
par(mfrow=c(4,2))
hist(bioChemists$art, breaks=breaks)
hist(fitted(fm_pois), breaks=breaks)
hist(fitted(fm_qpois), breaks=breaks)
hist(fitted(fm_nb), breaks=breaks)
hist(fitted(fm_zip), breaks=breaks)
hist(fitted(fm_zinb), breaks=breaks)
hist(fitted(fm_zip2), breaks=breaks)
hist(fitted(fm_zinb2), breaks=breaks)
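I cannot see any fundamental difference between the models (apart from the fact that the example data do not look very "zero-inflated" to me...); in fact, none of the models produces a halfway reasonable estimate of the number of zeros. Can anyone explain what the advantage of the zero-inflated models is? I assume there must be a reason this dataset was chosen as the example for the function.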