Answers:
为补充dsaxton的回答,这里有作为R一些模拟表示的抽样分布q时ķ = 10和q 0 = 0.02:
n_replications <- 10000
k <- 10
failure_prob <- 0.02
n_trials <- k + rnbinom(n_replications, size=k, prob=failure_prob)
all(n_trials >= k) # Sanity check, cannot have 10 failures in < 10 trials
estimated_failure_probability <- k / n_trials
histogram_breaks <- seq(0, max(estimated_failure_probability) + 0.001, 0.001)
## png("estimated_failure_probability.png")
hist(estimated_failure_probability, breaks=histogram_breaks)
abline(v=failure_prob, col="red", lty=2, lwd=2) # True failure probability in red
## dev.off()
mean(estimated_failure_probability) # Around 0.022
sd(estimated_failure_probability)
t.test(x=estimated_failure_probability, mu=failure_prob) # Interval around [0.0220, 0.0223]
它看起来像,这是相对于在可变性相当小的偏差q。
10+rnbinom(10000,10,0.02)
10/(10+rnbinom(10000,10,0.02))
。参数化的依据是成功/失败的次数,而不是试验的总数,因此您必须将k = 10加回来。请注意,无偏估计量将为9/(9+rnbinom(10000,10,0.02))
,分子和分母减少1。