Coverage probability of the basic bootstrap confidence interval



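I have the following problem for a course I'm working on:

Conduct a Monte Carlo study to estimate the coverage probabilities of the standard normal bootstrap confidence interval and the basic bootstrap confidence interval. Sample from a normal population and check the empirical coverage rates for the sample mean.

The coverage probability for the standard normal bootstrap CI is easy: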

n = 1000;
alpha = c(0.025, 0.975);
x = rnorm(n, 0, 1);
mu = mean(x);
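sqrt.n = sqrt(n);
B = 1000; # number of bootstrap resamples; B was not defined in the original post, 1000 is an assumed value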

LNorm = numeric(B);
UNorm = numeric(B);

for(j in 1:B)
{
    smpl = x[sample(1:n, size = n, replace = TRUE)];
    xbar = mean(smpl);
    s = sd(smpl);

    LNorm[j] = xbar + qnorm(alpha[1]) * (s / sqrt.n);
    UNorm[j] = xbar + qnorm(alpha[2]) * (s / sqrt.n);
}

mean(LNorm < 0 & UNorm > 0); # Approximates to 0.95
# NOTE: it is not good enough to look at overall coverage
# Must compute separately for each tail

From what we have learned in the course, the basic bootstrap confidence interval can be computed as follows:

# Using x from previous...
library(boot); # provides boot()
R = boot(data = x, R=1000, statistic = function(x, i){ mean(x[i]); });
result = 2 * mu - quantile(R$t, alpha, type=1);

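That makes sense. What I don't understand is how to compute the coverage probability for the basic bootstrap CI. I understand that the coverage probability is the proportion of times the CI contains the true value (in this case mu). Do I just run the boot function many times?

How should I approach this problem differently?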


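Is size=100 a typo? I don't believe you are getting the right upper and lower limits, since the implicit sample size appears to be 1000 when you compute the CIs in the loop (because sqrt.n is used in the calculation). Also, why are you comparing against mu rather than 0 directly (the latter being the true mean)?
cardinal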

Also, smpl = x[sample(1:n, size = 100, replace = TRUE)]; can be simplified to smpl = sample(x, size=100, replace=TRUE)
cardinal

@cardinal - Yes, that was a typo, and the same goes for mu and 0. The normal CI works fine; it is the basic bootstrap CI that I am having trouble with.
TheCloudlessSky

Answers:



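The terminology is probably not used consistently, so the following is only how I understand the original question. As I understand it, the normal CIs you computed are not what was asked for. Each set of bootstrap replicates gives you one confidence interval, not many. The way to compute the different CI types from the results of one set of bootstrap replicates is as follows: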

B    <- 999                  # number of replicates
muH0 <- 100                  # for generating data: true mean
sdH0 <- 40                   # for generating data: true sd
N    <- 200                  # sample size
DV   <- rnorm(N, muH0, sdH0) # simulated data: original sample

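Since I want to compare the calculations against the results from package boot, I first define a function that is called for each replicate. Its arguments are the original sample and an index vector specifying the cases for a single replicate. It returns $M^{\star}$, the plug-in estimate of $\mu$, as well as $S_{M}^{2\star}$, the plug-in estimate of the variance of the mean $\sigma_{M}^{2}$. The latter is required only for the bootstrap $t$-CI.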

> getM <- function(orgDV, idx) {
+     bsM   <- mean(orgDV[idx])                       # M*
+     bsS2M <- (((N-1) / N) * var(orgDV[idx])) / N    # S^2*(M)
+     c(bsM, bsS2M)
+ }

> library(boot)                                       # for boot(), boot.ci()
> bOut <- boot(DV, statistic=getM, R=B)
> boot.ci(bOut, conf=0.95, type=c("basic", "perc", "norm", "stud"))
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 999 bootstrap replicates
CALL : 
boot.ci(boot.out = bOut, conf = 0.95, type = c("basic", "perc", "norm", "stud"))

Intervals : 
Level      Normal            Basic         Studentized        Percentile    
95%   ( 95.6, 106.0 )   ( 95.7, 106.2 )  ( 95.4, 106.2 )   ( 95.4, 106.0 )  
Calculations and Intervals on Original Scale

Without using package boot, you can simply use replicate() to get a set of bootstrap replicates.

boots <- t(replicate(B, getM(DV, sample(seq(along=DV), replace=TRUE))))

But let's stick with the results from boot.ci() to have a reference.

boots   <- bOut$t                     # estimates from all replicates
M       <- mean(DV)                   # M from original sample
S2M     <- (((N-1)/N) * var(DV)) / N  # S^2(M) from original sample
Mstar   <- boots[ , 1]                # M* for each replicate
S2Mstar <- boots[ , 2]                # S^2*(M) for each replicate
biasM   <- mean(Mstar) - M            # bias of estimator M

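The basic, percentile, and $t$-CI rely on the empirical distribution of the bootstrap estimates. To get the $\alpha/2$ and $1 - \alpha/2$ quantiles, we find the corresponding indices into the sorted vector of bootstrap estimates (note that boot.ci() interpolates to find the empirical quantiles when the indices are not whole numbers).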

> (idx <- trunc((B + 1) * c(0.05/2, 1 - 0.05/2))) # indices for sorted vector of estimates
[1] 25 975

> (ciBasic <- 2*M - sort(Mstar)[idx])          # basic CI
[1] 106.21826  95.65911

> (ciPerc <- sort(Mstar)[idx])                 # percentile CI
[1] 95.42188 105.98103
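
As a reading aid, the two intervals above can be written out explicitly. This is only a restatement of the code, where $M^{\star}_{(k)}$ (notation introduced here, not in the original answer) denotes the $k$-th smallest of the $B = 999$ bootstrap means, so the indices 25 and 975 are the values of idx:

$$
\text{percentile: } \left[\, M^{\star}_{(25)},\; M^{\star}_{(975)} \,\right]
\qquad
\text{basic: } \left[\, 2M - M^{\star}_{(975)},\; 2M - M^{\star}_{(25)} \,\right]
$$

Because the basic CI subtracts the sorted quantiles from $2M$, ciBasic is printed with the upper limit first: $2M - 95.42188 = 106.21826$ and $2M - 105.98103 = 95.65911$.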

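For the $t$-CI, we need the bootstrap $t^{\star}$ estimates to calculate the critical $t$-values. For the standard normal CI, the critical value is simply the $z$-value from the standard normal distribution.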

# standard normal CI with bias correction
> zCrit   <- qnorm(c(0.025, 0.975))   # z-quantiles from std-normal distribution
> (ciNorm <- M - biasM + zCrit * sqrt(var(Mstar)))
[1] 95.5566 106.0043

> tStar <- (Mstar-M) / sqrt(S2Mstar)  # t*
> tCrit <- sort(tStar)[idx]           # t-quantiles from empirical t* distribution
> (ciT  <- M - tCrit * sqrt(S2M))     # studentized t-CI
[1] 106.20690  95.44878
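
Written as formulas, the two intervals computed above look roughly as follows. This only restates the code; the symbols $S_{M} = \sqrt{S_{M}^{2}}$ (sqrt(S2M) in the code), $\widehat{\text{bias}}$ (biasM) and $t^{\star}_{(k)}$ (the $k$-th smallest value of tStar) are notation chosen here for illustration:

$$
\text{normal: } M - \widehat{\text{bias}} \pm z_{0.975} \sqrt{\widehat{\operatorname{Var}}(M^{\star})},
\qquad
\widehat{\text{bias}} = \overline{M^{\star}} - M
$$

$$
\text{studentized: } \left[\, M - t^{\star}_{(975)}\, S_{M},\; M - t^{\star}_{(25)}\, S_{M} \,\right],
\qquad
t^{\star} = \frac{M^{\star} - M}{\sqrt{S_{M}^{2\star}}}
$$

As with the basic CI, the subtraction reverses the order of the quantiles, which is why ciT is printed with the upper limit first.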

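To estimate the coverage probabilities of these CI types, you will have to run this simulation many times. Just wrap the code into a function, return a list with the CI results, and run it with replicate() as demonstrated in this gist.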
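
A minimal sketch of such a wrapper, assuming base R only and covering just the basic and normal CIs. The names oneCI, nSim, res and covers are hypothetical and chosen for this example, and nSim = 500 and the seed are arbitrary; this is not the code from the gist.

```r
# Sketch: Monte Carlo estimate of coverage for two bootstrap CI types.
# oneCI, nSim, res, covers are hypothetical names used only in this example.
set.seed(123)                      # arbitrary seed, only for reproducibility
muH0 <- 100                        # for generating data: true mean
sdH0 <- 40                         # for generating data: true sd
N    <- 200                        # sample size
B    <- 999                        # bootstrap replicates per simulated sample
nSim <- 500                        # number of simulated samples (assumed value)

oneCI <- function() {
    DV    <- rnorm(N, muH0, sdH0)                           # fresh original sample
    M     <- mean(DV)                                       # M from original sample
    Mstar <- replicate(B, mean(sample(DV, replace=TRUE)))   # M* for each replicate
    idx   <- trunc((B + 1) * c(0.05/2, 1 - 0.05/2))         # indices 25 and 975
    ciBasic <- rev(2*M - sort(Mstar)[idx])                  # rev() gives (lower, upper)
    ciNorm  <- M - (mean(Mstar) - M) + qnorm(c(0.025, 0.975)) * sd(Mstar)
    c(basicLo=ciBasic[1], basicUp=ciBasic[2],
      normLo=ciNorm[1],   normUp=ciNorm[2])
}

res <- replicate(nSim, oneCI())    # 4 x nSim matrix, one column per simulated sample

# proportion of intervals that contain the true mean
covers <- c(basic = mean(res["basicLo", ] <= muH0 & muH0 <= res["basicUp", ]),
            norm  = mean(res["normLo", ]  <= muH0 & muH0 <= res["normUp", ]))
covers

# miss rates per tail for the basic CI (interval entirely below / above muH0)
c(below = mean(res["basicUp", ] < muH0), above = mean(res["basicLo", ] > muH0))
```

With normally distributed data and this sample size, both estimated coverage rates should land near the nominal 0.95, up to Monte Carlo error. Note that the intervals are compared against the true mean muH0, not against the sample mean, and that the limits of the basic CI are put into (lower, upper) order before the comparison.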


Wow! Great explanation of what I was doing wrong. Also, thanks for the code tips! This works perfectly!
TheCloudlessSky

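OK, one last question: when I tried to replicate this, I created a function computeCIs and called results = replicate(500, computeCIs());. At the end, computeCIs returns c(ciBasic, ciPerc). To test the coverage probability, should I then compute mean(results[1, ] < 0 & results[2, ] > 0) to find the proportion of basic CIs that contain the true mean (the coverage)? When I run this I get 1, when I think I should be getting 0.95.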
TheCloudlessSky

@TheCloudlessSky For a complete function and a full simulation, with the expected results in terms of coverage frequencies, see pastebin.com/qKpNKK0D
caracal

Yep, I'm an idiot :) ... I made a typo when copying the code into R... Thanks for all your help! :)
TheCloudlessSky

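Thanks @caracal for the nice answer. The link pastebin.com/qKpNKK0D is broken. It would be much appreciated if you could update it and provide the complete function and full simulation. Thanks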
MYaseen208
Licensed under cc by-sa 3.0 with attribution required.