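I have two datasets and I would like to know whether they are significantly different (this follows on from "Two groups significantly different? Test to use").
I decided to use a permutation test, and did the following in R: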
permutation.test <- function(coding, lncrna) {
  coding <- coding[, 1]  # dataset1
  lncrna <- lncrna[, 1]  # dataset2
  ### Under the null hypothesis, both datasets would be the same. So:
  d <- c(coding, lncrna)
  # Observed difference
  diff.observed = mean(coding) - mean(lncrna)
  number_of_permutations = 5000
  diff.random = NULL
  for (i in 1:number_of_permutations) {
    # Sample from the combined dataset (note: drawn with replacement)
    a.random = sample(d, length(coding), replace = TRUE)
    b.random = sample(d, length(lncrna), replace = TRUE)
    # Null (permuted) difference
    diff.random[i] = mean(b.random) - mean(a.random)
  }
  # P-value is the fraction of permuted differences that are equal to or more extreme than the observed difference
  pvalue = sum(abs(diff.random) >= abs(diff.observed)) / number_of_permutations
  pvalue
}
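However, according to this paper, a p-value should never be 0: http://www.statsci.org/smyth/pubs/permp.pdf
What do you recommend? Is computing the p-value this way:
pvalue = sum(abs(diff.random) >= abs(diff.observed)) / number_of_permutations
a good approach? Or is it better to do the following?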
pvalue = (sum(abs(diff.random) >= abs(diff.observed)) + 1) / (number_of_permutations + 1)
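To make the comparison concrete, here is a minimal sketch that computes both versions of the p-value side by side. It assumes each permutation shuffles the pooled data without replacement and then splits it into two groups, which is how a permutation test is usually defined (the function above draws with replacement instead). The function name permutation_pvalue, its parameters n_perm and seed, and the simulated example data are made up for illustration and are not part of my real analysis.

```r
# Sketch of a two-sided permutation test on the difference in means.
# `x` and `y` are numeric vectors; `n_perm` and `seed` are illustrative parameters.
permutation_pvalue <- function(x, y, n_perm = 5000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  pooled <- c(x, y)
  n_x <- length(x)
  diff_observed <- mean(x) - mean(y)

  diff_random <- numeric(n_perm)
  for (i in seq_len(n_perm)) {
    # Shuffle the pooled values once (no replacement), then split into two groups
    shuffled <- sample(pooled)
    diff_random[i] <- mean(shuffled[seq_len(n_x)]) - mean(shuffled[-seq_len(n_x)])
  }

  # b = number of permuted differences at least as extreme as the observed one
  b <- sum(abs(diff_random) >= abs(diff_observed))
  list(
    naive = b / n_perm,                  # can be exactly 0
    corrected = (b + 1) / (n_perm + 1)   # never below 1 / (n_perm + 1)
  )
}

# Simulated example data, for illustration only
set.seed(1)
coding <- rnorm(50, mean = 10)
lncrna <- rnorm(60, mean = 0)
res <- permutation_pvalue(coding, lncrna, n_perm = 999, seed = 42)
print(res)
```

With groups this far apart, I would expect no permuted difference to reach the observed one, so the first estimate collapses to 0 while the second stays at its floor of 1 / (n_perm + 1).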