Answers:
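Yes, there are some simple relationships between confidence interval comparisons and hypothesis tests in a wide range of practical settings. However, in addition to verifying that the CI procedures and the t-test are appropriate for our data, we must check that the sample sizes are not too different and that the two sets have similar standard deviations. We also should not attempt to derive highly precise p-values from comparing two confidence intervals, but should be glad to develop effective approximations.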
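In trying to reconcile the two replies already given (by @John and @Brett), it helps to be mathematically explicit. A formula for a symmetric two-sided confidence interval appropriate for the setting of this question is

$$\text{CI} = m \pm \frac{t_\alpha(n)\, s}{\sqrt{n}}$$

where $m$ is the sample mean of $n$ independent observations, $s$ is the sample standard deviation, $2\alpha$ is the desired test size (maximum false positive rate), and $t_\alpha(n)$ is the upper $1-\alpha$ percentile of the Student t distribution with $n$ degrees of freedom. (This slight deviation from conventional notation simplifies the exposition by obviating any need to fuss over the $n$ vs. $n-1$ distinction, which will be inconsequential anyway.)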
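Using subscripts $1$ and $2$ to distinguish two independent sets of data for comparison, with $1$ corresponding to the larger of the two means, a non-overlap of confidence intervals is expressed by the inequality (lower confidence limit 1) $>$ (upper confidence limit 2); viz.,

$$m_1 - \frac{t_\alpha(n_1)\, s_1}{\sqrt{n_1}} > m_2 + \frac{t_\alpha(n_2)\, s_2}{\sqrt{n_2}}.$$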
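This can be made to look like the t-statistic of the corresponding hypothesis test (to compare the two means) with simple algebraic manipulations, yielding

$$\frac{m_1-m_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} > \frac{s_1\sqrt{n_2}\,t_\alpha(n_1) + s_2\sqrt{n_1}\,t_\alpha(n_2)}{\sqrt{n_1 s_2^2 + n_2 s_1^2}}.$$

The left-hand side is the statistic used in the hypothesis test; it is usually compared to a percentile of a Student t distribution with $n_1+n_2$ degrees of freedom: that is, to $t_\alpha(n_1+n_2)$. The right-hand side is a biased weighted average of the original t distribution percentiles.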
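The algebra behind this step can be sketched as follows. The shorthand $t_i = t_\alpha(n_i)$ is introduced only for this sketch. Starting from the non-overlap condition, collect the means on one side, put the right-hand side over a common denominator, and divide by the standard error of the difference:

```latex
\begin{align*}
m_1 - m_2 &> \frac{t_1 s_1}{\sqrt{n_1}} + \frac{t_2 s_2}{\sqrt{n_2}}
           = \frac{s_1\sqrt{n_2}\,t_1 + s_2\sqrt{n_1}\,t_2}{\sqrt{n_1 n_2}}, \\[4pt]
\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
          &= \frac{\sqrt{n_2 s_1^2 + n_1 s_2^2}}{\sqrt{n_1 n_2}}, \\[4pt]
\frac{m_1 - m_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}
          &> \frac{s_1\sqrt{n_2}\,t_1 + s_2\sqrt{n_1}\,t_2}{\sqrt{n_1 s_2^2 + n_2 s_1^2}}.
\end{align*}
```

As a quick check, when $s_1 = s_2$ and $n_1 = n_2 = n$ the right-hand side reduces to $\sqrt{2}\,t_\alpha(n)$, so non-overlap demands a t-statistic about 1.41 times larger than the single-interval critical value.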
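The analysis so far justifies the reply by @Brett: there appears to be no simple relationship available. However, let's probe further. I am inspired to do so because, intuitively, a non-overlap of confidence intervals ought to say something!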
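First, notice that this form of the hypothesis test is valid only when we expect $s_1$ and $s_2$ to be at least approximately equal. (Otherwise we face the notorious Behrens-Fisher problem and its complexities.) Upon checking the approximate equality of the $s_i$, we could then create an approximate simplification in the form

$$\frac{m_1-m_2}{s\sqrt{1/n_1 + 1/n_2}} > \frac{\sqrt{n_2}\,t_\alpha(n_1) + \sqrt{n_1}\,t_\alpha(n_2)}{\sqrt{n_1 + n_2}}.$$

Here, $s \approx s_1 \approx s_2$. Realistically, we should not expect this informal comparison of confidence limits to have the same size as $\alpha$. Our question then is whether there exists an $\alpha'$ such that the right-hand side is (at least approximately) equal to the correct t statistic. Namely, for what $\alpha'$ is it the case that

$$t_{\alpha'}(n_1+n_2) = \frac{\sqrt{n_2}\,t_\alpha(n_1) + \sqrt{n_1}\,t_\alpha(n_2)}{\sqrt{n_1 + n_2}}\text{?}$$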
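In the large-sample limit this question can be answered directly, because the Student t percentiles approach standard normal percentiles. The sketch below makes that substitution (an assumption that only holds for large samples) and uses Python's standard-library `statistics.NormalDist`. The function name `implied_alpha_large_n` and its arguments are hypothetical, chosen for this illustration. Here `alpha` is the one-sided tail area, so a two-sided 95% interval corresponds to `alpha = 0.025`.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # stands in for the t distribution as n grows large


def implied_alpha_large_n(alpha: float, n1: float = 1.0, n2: float = 1.0) -> float:
    """One-sided test size alpha' implied by non-overlap of two CIs.

    Large-sample sketch: t_alpha(n) is replaced by the normal percentile,
    so only the ratio n1 : n2 matters, not the absolute sample sizes.
    """
    z = STD_NORMAL.inv_cdf(1.0 - alpha)  # upper 1 - alpha percentile
    # Right-hand side of the simplified inequality (equal standard deviations).
    threshold = (sqrt(n2) * z + sqrt(n1) * z) / sqrt(n1 + n2)
    # alpha' is the upper tail area beyond that threshold.
    return 1.0 - STD_NORMAL.cdf(threshold)


# Two-sided 95% intervals, equal sample sizes.
two_sided_size = 2 * implied_alpha_large_n(0.025)
print(round(two_sided_size, 4))  # 0.0056
```

For equal sample sizes the threshold is $\sqrt{2}$ times the normal percentile, which is why non-overlap of two 95% intervals corresponds to a far smaller test size than 0.05. Small samples need the actual t percentiles, which this sketch does not compute.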
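It turns out that for equal sample sizes, $\alpha$ and $\alpha'$ are connected (to pretty high accuracy) by a power law. For instance, here is a log-log plot of the two for the cases $n_1=n_2=2$ (lowest blue line), $n_1=n_2=5$ (middle red line), and $n_1=n_2=\infty$ (highest gold line). The middle green dashed line is an approximation described below. The straightness of these curves indicates a power law. It varies with $n=n_1=n_2$, but not much.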
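The answer does depend on the set $\{n_1, n_2\}$, but it is natural to wonder how much it really varies with changes in the sample sizes. In particular, we could hope that for moderate to large sample sizes (maybe $n_1 \ge 10$, $n_2 \ge 10$ or thereabouts) the sample size makes little difference. In this case, we could develop a quantitative way to relate $\alpha'$ to $\alpha$.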
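This approach turns out to work provided the sample sizes are not too different from each other. In the spirit of simplicity, I will report an omnibus formula for computing the test size $\alpha'$ corresponding to the confidence interval size $\alpha$. It is

$$\alpha' \approx e\,\alpha^{1.91};$$

that is,

$$\alpha' \approx \exp(1 + 1.91\log(\alpha)).$$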
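A small sketch of how the omnibus formula can be evaluated, assuming it has the form $\alpha' \approx e\,\alpha^{1.91}$ with $\alpha$ the one-sided tail area (half the two-sided CI size). The function name `approx_test_size` is hypothetical. The loop converts two-sided CI sizes $2\alpha$ into approximate two-sided test sizes $2\alpha'$.

```python
from math import e, exp, log


def approx_test_size(alpha: float) -> float:
    """Power-law approximation alpha' ~ e * alpha**1.91 (one-sided sizes)."""
    return e * alpha ** 1.91


def approx_test_size_log_form(alpha: float) -> float:
    """Same approximation written as exp(1 + 1.91 * log(alpha))."""
    return exp(1.0 + 1.91 * log(alpha))


# Two-sided CI sizes 2*alpha mapped to approximate two-sided test sizes 2*alpha'.
for two_alpha in (0.1, 0.05, 0.01, 0.005):
    two_alpha_prime = 2 * approx_test_size(two_alpha / 2)
    print(f"2*alpha = {two_alpha:<6} 2*alpha' ~ {two_alpha_prime:.2g}")
```

Rounded to one significant figure, these values agree with the tabulation given at the end of this answer. Keeping both functions makes it easy to confirm that the two written forms of the formula are the same quantity.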
This formula works reasonably well in these common situations:
Both sample sizes are close to each other, $n_1 \approx n_2$, and $\alpha$ is not too extreme ($\alpha > .001$ or so).
One sample size is within about three times the other and the smallest isn't too small (roughly, greater than $10$) and again $\alpha$ is not too extreme.
One sample size is within three times the other and $\alpha > .02$ or so.
The relative error (correct value divided by the approximation) in the first situation is plotted here, with the lower (blue) line showing the case $n_1=n_2=2$, the middle (red) line the case $n_1=n_2=5$, and the upper (gold) line the case $n_1=n_2=\infty$. Interpolating between the latter two, we see that the approximation is excellent for a wide range of practical values of $\alpha$ when sample sizes are moderate (around 5-50) and otherwise is reasonably good.
This is more than good enough for eyeballing a bunch of confidence intervals.
To summarize, the failure of two $2\alpha$-size confidence intervals of means to overlap is significant evidence of a difference in means at a level equal to $2e\,\alpha^{1.91}$, provided the two samples have approximately equal standard deviations and are approximately the same size.
I'll end with a tabulation of the approximation for common values of $2\alpha$.

$$\begin{array}{ll}
2\alpha & 2\alpha' \\
\hline
0.1 & 0.02 \\
0.05 & 0.005 \\
0.01 & 0.0002 \\
0.005 & 0.00006
\end{array}$$
For example, when a pair of two-sided 95% CIs ($2\alpha = .05$) for samples of approximately equal sizes do not overlap, we should take the means to be significantly different, $p < .005$. The correct p-value (for equal sample sizes $n$) actually lies between $.0037$ ($n=2$) and $.0056$ ($n=\infty$).
This result justifies (and I hope improves upon) the reply by @John. Thus, although the previous replies appear to be in conflict, both are (in their own ways) correct.
No, not a simple one at least.
There is, however, an exact correspondence between the t-test of difference between two means and the confidence interval for the difference between the two means.
If the confidence interval for the difference between two means contains zero, a t-test for that difference would fail to reject the null at the same level of confidence. Likewise, if the confidence interval does not contain 0, the t-test would reject the null.
This is not the same as overlap between confidence intervals for each of the two means.
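A small numerical sketch of that distinction, using made-up summary statistics (the means and standard errors below are illustrative, not from any dataset) and a normal critical value as a large-sample stand-in for the t percentile. The helper names `interval` and `intervals_overlap` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Large-sample assumption: normal critical value in place of a t percentile.
Z95 = NormalDist().inv_cdf(0.975)


def interval(mean: float, se: float, z: float = Z95) -> tuple[float, float]:
    """Symmetric two-sided interval mean +/- z * se."""
    return (mean - z * se, mean + z * se)


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Illustrative group summaries: (mean, standard error of the mean).
m1, se1 = 0.0, 0.4
m2, se2 = 1.3, 0.4

ci1 = interval(m1, se1)
ci2 = interval(m2, se2)
# Independent groups: the standard errors add in quadrature.
ci_diff = interval(m2 - m1, sqrt(se1**2 + se2**2))

overlap = intervals_overlap(ci1, ci2)
diff_excludes_zero = not (ci_diff[0] <= 0.0 <= ci_diff[1])
print(overlap, diff_excludes_zero)  # True True
```

Here the two individual 95% intervals overlap, yet the 95% interval for the difference excludes zero, so the test at the 0.05 level rejects. Overlap of the individual intervals is therefore the more conservative criterion.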
Under typical assumptions of equal variance, yes, there is a relationship. If the bars overlap by less than the length of one bar * sqrt(2) then a t-test would find them to be significantly different at alpha = 0.05. If the ends of the bars just barely touch then a difference would be found at 0.01. If the confidence intervals for the groups are not equal one typically takes the average and applies the same rule.
Alternatively, if the width of a confidence interval around one of the means is w, then the least significant difference between two values is w * sqrt(2). This is simple when you think of the denominator in the independent groups t-test, sqrt(2*MSE/n), and the factor for the CI, which is sqrt(MSE/n).
(95% CIs assumed)
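The sqrt(2) factor can be written out in one line. This is a sketch assuming equal group sizes $n$, a common MSE, and the same critical value for the interval and the test. The symbol $t_{\text{crit}}$ is introduced here for illustration, and $w$ is read as the distance from the mean to one end of the interval.

```latex
\begin{align*}
w &= t_{\text{crit}} \sqrt{\frac{\mathrm{MSE}}{n}}, \\
\text{LSD} &= t_{\text{crit}} \sqrt{\frac{2\,\mathrm{MSE}}{n}}
            = \sqrt{2}\; t_{\text{crit}} \sqrt{\frac{\mathrm{MSE}}{n}}
            = \sqrt{2}\, w .
\end{align*}
```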
There's a simple paper on making inferences from confidence intervals around independent means, cited below. It will answer this question and many other related ones you may have.
Cumming, G., & Finch, S. (2005, March). Inference by eye: confidence intervals, and how to read pictures of data. American Psychologist, 60(2), 170-180.