When comparing 2 independent means, the SPSS t-test procedure reports two analyses, one assuming equal variances and one not assuming equal variances. The degrees of freedom (df) when equal variances are assumed are always integer-valued (and equal n-2). The df when equal variances are not assumed are non-integer (e.g., 11.467) and nowhere near n-2. I am seeking an explanation of the logic and method used to calculate these non-integer df.
Answers:
The Welch–Satterthwaite d.f. can be shown to be a scaled weighted harmonic mean of the two degrees of freedom, with weights in proportion to the corresponding squared standard errors.
The original expression reads:

$$\nu_W = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$$

Note that $r_i = s_i^2/n_i$ is the estimated variance of the $i$th sample mean, or the square of the $i$th standard error of the mean. Let $r = r_1/r_2$ (the ratio of the estimated variances of the sample means) and $\nu_i = n_i - 1$, so

$$\nu_W = \frac{(r_1+r_2)^2}{\frac{r_1^2}{\nu_1}+\frac{r_2^2}{\nu_2}} = \frac{(r+1)^2}{\frac{r^2}{\nu_1}+\frac{1}{\nu_2}} = \frac{(r+1)^2}{r^2+1}\cdot\frac{r^2+1}{\frac{r^2}{\nu_1}+\frac{1}{\nu_2}}$$

The first factor is $1+\operatorname{sech}(\log(r))$, which increases from $1$ at $r=0$ to $2$ at $r=1$ and then decreases to $1$ at $r\to\infty$; it's symmetric in $\log r$.
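As a numerical sanity check (a sketch, assuming NumPy is available), the factored form — $(1+\operatorname{sech}(\log r))$ times a weighted harmonic mean of the d.f. — reproduces the standard Welch–Satterthwaite formula exactly:

```python
import numpy as np

def welch_df(s1, n1, s2, n2):
    """Standard Welch-Satterthwaite df from sample SDs and sizes."""
    r1, r2 = s1**2 / n1, s2**2 / n2  # squared standard errors of the means
    return (r1 + r2)**2 / (r1**2 / (n1 - 1) + r2**2 / (n2 - 1))

def welch_df_factored(s1, n1, s2, n2):
    """Same df written as (1 + sech(log r)) times a weighted harmonic mean."""
    r1, r2 = s1**2 / n1, s2**2 / n2
    r = r1 / r2
    nu1, nu2 = n1 - 1, n2 - 1
    first = 1 + 1 / np.cosh(np.log(r))          # 1 + sech(log r), in (1, 2]
    w1, w2 = r**2, 1.0                          # weights proportional to r_i^2
    second = (w1 + w2) / (w1 / nu1 + w2 / nu2)  # weighted harmonic mean of the df
    return first * second

print(welch_df(3.0, 8, 1.0, 12))           # non-integer df, ~8.05
print(welch_df_factored(3.0, 8, 1.0, 12))  # same value via the factorization
```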
The second factor is a weighted harmonic mean:

$$\mathcal{H}(\nu_1, \nu_2) = \frac{\sum_i w_i}{\sum_i \frac{w_i}{\nu_i}}$$

of the d.f., where $w_i = r_i^2$ are the relative weights to the two d.f.
Which is to say, when $r_1/r_2$ is very large, it converges to $\nu_1$. When $r_1/r_2$ is very close to $0$ it converges to $\nu_2$. When $r_1 = r_2$ you get twice the harmonic mean of the d.f., and when $\nu_1 = \nu_2$ as well you get the usual equal-variance t-test d.f. ($\nu_1 + \nu_2 = n_1 + n_2 - 2$), which is also the maximum possible value for $\nu_W$.
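These limiting cases are easy to verify numerically (a sketch; `welch_df` here is just the Welch–Satterthwaite formula written in terms of the squared standard errors $r_i$ and per-group d.f. $\nu_i$):

```python
def welch_df(r1, r2, nu1, nu2):
    """Welch-Satterthwaite df in terms of the squared standard errors r_i
    of the two sample means and the per-group df nu_i = n_i - 1."""
    return (r1 + r2)**2 / (r1**2 / nu1 + r2**2 / nu2)

nu1, nu2 = 9, 14
# r1 >> r2: df approaches nu1
print(welch_df(1e6, 1.0, nu1, nu2))   # ~ 9
# r1 << r2: df approaches nu2
print(welch_df(1e-6, 1.0, nu1, nu2))  # ~ 14
# r1 == r2: twice the harmonic mean of nu1 and nu2
print(welch_df(1.0, 1.0, nu1, nu2), 2 * 2 / (1 / nu1 + 1 / nu2))
# r1 == r2 and nu1 == nu2: the pooled df nu1 + nu2 (the maximum)
print(welch_df(1.0, 1.0, 10, 10))     # 20
```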
--
With an equal-variance t-test, if the assumptions hold, the square of the denominator is a constant times a chi-square random variate.
The square of the denominator of the Welch t-test isn't (a constant times) a chi-square; however, it's often not too bad an approximation. A relevant discussion can be found here.
A more textbook-style derivation can be found here.
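For completeness, the textbook derivation alluded to is a short moment-matching argument (a sketch, using the notation above, $\nu_i = n_i - 1$). Under normality, $\nu_i s_i^2/\sigma_i^2 \sim \chi^2_{\nu_i}$, and we approximate the distribution of $S = s_1^2/n_1 + s_2^2/n_2$ by that of $c\,\chi^2_\nu/\nu$ for constants $c$ and $\nu$. Matching the first two moments,

$$E(S) = \frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2} = c, \qquad \operatorname{Var}(S) = \frac{2(\sigma_1^2/n_1)^2}{\nu_1}+\frac{2(\sigma_2^2/n_2)^2}{\nu_2} = \frac{2c^2}{\nu},$$

which gives

$$\nu = \frac{\left(\sigma_1^2/n_1+\sigma_2^2/n_2\right)^2}{\frac{(\sigma_1^2/n_1)^2}{\nu_1}+\frac{(\sigma_2^2/n_2)^2}{\nu_2}},$$

and substituting the sample variances $s_i^2$ for the unknown $\sigma_i^2$ yields the Welch–Satterthwaite $\nu_W$.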
What you are referring to is the Welch–Satterthwaite correction to the degrees of freedom. The $t$-test when the WS correction is applied is often called Welch's $t$-test. (Incidentally, this has nothing to do with SPSS; all statistical software will be able to conduct Welch's $t$-test, they just don't usually report both side by side by default, so you wouldn't necessarily be prompted to think about the issue.) The equation for the correction is very ugly, but can be seen on the Wikipedia page; unless you are very math savvy or a glutton for punishment, I don't recommend trying to work through it to understand the idea.

From a loose conceptual standpoint however, the idea is relatively straightforward: the regular $t$-test assumes the variances are equal in the two groups. If they're not, then the test should not benefit from that assumption. Since the power of the $t$-test can be seen as a function of the residual degrees of freedom, one way to adjust for this is to 'shrink' the df somewhat. The appropriate df must be somewhere between the full df and the df of the smaller group. (As @Glen_b notes below, it depends on the relative sizes of $s_1^2/n_1$ vs. $s_2^2/n_2$; if the larger $n$ is associated with a sufficiently smaller variance, the combined df can be lower than the larger of the two df.) The WS correction finds the right proportion of the way from the former to the latter to adjust the df. Then the test statistic is assessed against a $t$-distribution with that df.
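To see the 'shrinkage' concretely, here is a minimal sketch (assuming NumPy) that reproduces the two df values SPSS prints side by side for an independent-samples $t$-test; the data and seed are arbitrary illustration choices:

```python
import numpy as np

def two_sample_dfs(x, y):
    """Return (pooled df, Welch-Satterthwaite df) for two samples,
    mirroring the two rows SPSS reports for an independent-samples t-test."""
    n1, n2 = len(x), len(y)
    v1, v2 = np.var(x, ddof=1), np.var(y, ddof=1)
    pooled_df = n1 + n2 - 2                      # equal-variance assumption
    r1, r2 = v1 / n1, v2 / n2                    # squared standard errors
    welch_df = (r1 + r2)**2 / (r1**2 / (n1 - 1) + r2**2 / (n2 - 1))
    return pooled_df, welch_df

rng = np.random.default_rng(0)
x = rng.normal(0, 1, size=8)    # small, low-variance group
y = rng.normal(0, 5, size=12)   # larger, high-variance group
print(two_sample_dfs(x, y))     # Welch df is non-integer and below n1+n2-2
```

In practice one would simply pass `equal_var=False` to `scipy.stats.ttest_ind` to run Welch's test rather than computing the df by hand.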