Answers:
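The permutation distribution of your test statistic is not guaranteed to be symmetric, so you cannot simply double a one-sided p-value. Instead, you add both tails. In your case of two independent samples, the null hypothesis is that the two location parameters are equal. Assuming continuous distributions and equal spread in both groups, we have exchangeability under the null hypothesis. The test statistic $T$ is the difference in means, with $E(T) = 0$ under the null.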
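Why the permutation distribution is centered at zero can be seen from a short calculation. As a sketch, write $x_1, \dots, x_N$ for the pooled observations and $n_A$, $n_B = N - n_A$ for the two group sizes (this notation is introduced here for illustration only). Each observation lands in the first group in a fraction $n_A / N$ of all possible splits $A$, so averaging the difference in means over all splits gives

$$\frac{1}{\binom{N}{n_A}} \sum_{A} \left( \frac{1}{n_A} \sum_{i \in A} x_i - \frac{1}{n_B} \sum_{i \notin A} x_i \right) = \frac{1}{n_A} \cdot \frac{n_A}{N} \sum_{i=1}^{N} x_i - \frac{1}{n_B} \cdot \frac{n_B}{N} \sum_{i=1}^{N} x_i = 0.$$

Centering at zero says nothing about shape: the two tails can still carry different mass, which is why they are added rather than one of them being doubled.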
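Let $T_{\text{emp}}$ be the value of $T$ in the original sample and $T^{\star}$ its values for the permutations. $\sharp(\cdot)$ is short for "number of", e.g., $\sharp(T^{\star})$ is the number of permutation test statistics. The p-value for the two-sided hypothesis is then $p_{\text{ts}} = p_{\text{left}} + p_{\text{right}}$, where

$$p_{\text{left}} = \frac{\sharp\left(T^{\star} \leq \min(T_{\text{emp}}, -T_{\text{emp}})\right)}{\sharp(T^{\star})}, \qquad p_{\text{right}} = \frac{\sharp\left(T^{\star} \geq \max(T_{\text{emp}}, -T_{\text{emp}})\right)}{\sharp(T^{\star})}$$

(assuming we have the complete permutation distribution). Let's compare both approaches, doubling one tail versus adding both tails, for the case of two independent samples when we can calculate the exact (complete) permutation distribution.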
set.seed(1234)
Nj <- c(9, 8) # group sizes
DVa <- rnorm(Nj[1], 5, 20)^2 # data group 1
DVb <- rnorm(Nj[2], 10, 20)^2 # data group 2
DVab <- c(DVa, DVb) # data from both groups
IV <- factor(rep(c("A", "B"), Nj)) # grouping factor
idx <- seq_along(DVab) # all indices
idxA <- combn(idx, Nj[1]) # all possible first groups
# function to calculate test statistic for a given permutation x
getDM <- function(x) { mean(DVab[x]) - mean(DVab[!(idx %in% x)]) }
resDM <- apply(idxA, 2, getDM) # test statistic for all permutations
diffM <- mean(DVa) - mean(DVb) # empirical test statistic
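Now calculate the p-values and validate the proposed solution with the implementation in R's coin package. Observe that $p_{\text{left}} \neq p_{\text{right}}$, so it matters which way you calculate $p_{\text{ts}}$.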
> (pL <- sum(resDM <= min(diffM, -diffM)) / length(resDM)) # left p-value
[1] 0.1755245
> (pR <- sum(resDM >= max(diffM, -diffM)) / length(resDM)) # right p-value
[1] 0.1585356
> 2*pL # doubling left p-value
[1] 0.351049
> 2*pR # doubling right p-value
[1] 0.3170712
> pL+pR # two-sided p-value
[1] 0.3340601
> sum(abs(resDM) >= abs(diffM)) / length(resDM) # two-sided p-value (more concise)
[1] 0.3340601
# validate with coin implementation
> library(coin) # for oneway_test()
> oneway_test(DVab ~ IV, alternative="two.sided", distribution="exact")
Exact 2-Sample Permutation Test
data: DVab by IV (A, B)
Z = 1.0551, p-value = 0.3341
alternative hypothesis: true mu is not equal to 0
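The unequal tails reflect the asymmetry of the permutation distribution, and this can be inspected directly. The following is a minimal sketch that rebuilds the data with the same seed so it runs on its own. The name `skewDM` and the use of the third standardized moment as an asymmetry measure are assumptions made for illustration, not part of the answer's own code.

```r
set.seed(1234)
Nj    <- c(9, 8)                         # group sizes
DVa   <- rnorm(Nj[1], 5, 20)^2           # data group 1
DVb   <- rnorm(Nj[2], 10, 20)^2          # data group 2
DVab  <- c(DVa, DVb)                     # data from both groups
idxA  <- combn(seq_along(DVab), Nj[1])   # all possible first groups

# difference in means for every possible split into two groups
resDM <- apply(idxA, 2, function(x) mean(DVab[x]) - mean(DVab[-x]))
diffM <- mean(DVa) - mean(DVb)           # empirical test statistic

# the permutation distribution is centered at zero (up to rounding) ...
centerDM <- mean(resDM)

# ... but need not be symmetric: a symmetric distribution would give 0 here
skewDM <- mean((resDM - centerDM)^3) / sd(resDM)^3

# tail masses beyond -|diffM| and +|diffM|
pL <- mean(resDM <= -abs(diffM))
pR <- mean(resDM >=  abs(diffM))
print(c(center = centerDM, skewness = skewDM, left = pL, right = pR))
```

If the distribution were symmetric around zero, `pL` and `pR` would coincide and doubling either tail would give the same answer as adding them.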
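P.S. For the Monte Carlo case, where we only sample from the permutation distribution, the p-values would be defined as follows: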
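A minimal sketch of the Monte Carlo case in R, rebuilding the data so it runs on its own. It assumes one common convention: the observed sample is counted as one additional permutation by adding 1 to both numerator and denominator. That convention, the number of resamples `nPerm`, and the names `getDMmc` and `resMC` are assumptions made for illustration.

```r
set.seed(1234)
Nj    <- c(9, 8)                        # group sizes
DVa   <- rnorm(Nj[1], 5, 20)^2          # data group 1
DVb   <- rnorm(Nj[2], 10, 20)^2         # data group 2
DVab  <- c(DVa, DVb)                    # data from both groups
diffM <- mean(DVa) - mean(DVb)          # empirical test statistic

nPerm <- 9999                           # number of random permutations (assumed)

# test statistic for one randomly drawn first group
getDMmc <- function() {
    idxA <- sample(length(DVab), Nj[1])
    mean(DVab[idxA]) - mean(DVab[-idxA])
}
resMC <- replicate(nPerm, getDMmc())

# assumed convention: the observed sample counts as one more permutation
pLmc  <- (sum(resMC <= min(diffM, -diffM)) + 1) / (nPerm + 1)
pRmc  <- (sum(resMC >= max(diffM, -diffM)) + 1) / (nPerm + 1)
pTSmc <- (sum(abs(resMC) >= abs(diffM)) + 1) / (nPerm + 1)
print(c(left = pLmc, right = pRmc, twoSided = pTSmc))
```

Under this convention the two-sided value is computed from the absolute values directly rather than as `pLmc + pRmc`, because the sum would count the observed sample twice and exceed `pTSmc` by `1 / (nPerm + 1)`.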