Expected value of the sample median, given the sample mean


16

Let $Y$ denote the median and let $\bar{X}$ denote the mean of a random sample of size $n = 2k+1$ from a $N(\mu, \sigma^2)$ distribution. How can I compute $E(Y \mid \bar{X} = \bar{x})$?

Intuitively, because of the normality assumption, it makes sense to claim that $E(Y \mid \bar{X} = \bar{x}) = \bar{x}$, and indeed that is the correct answer. Can it be shown rigorously?

My initial thought was to attack this problem using the conditional normal distribution, which is a generally known result. The trouble there is that, since I do not know the expected value and hence the variance of the median, I would have to compute those using the $(k+1)$-th order statistic. That is very involved, and I would rather not go there unless absolutely necessary.
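As a sanity check of the claim (a sketch added here, not part of the original question; the parameters, seed, and bin grid are arbitrary choices), one can condition approximately on $\bar{X}$ by grouping simulated samples into bins of their mean and comparing, within each bin, the average median with the average mean:

```python
# Monte Carlo sanity check of E(Y | X_bar = x_bar) = x_bar for normal samples.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 3.0, 11, 200_000   # n = 2k + 1 is odd

samples = rng.normal(mu, sigma, size=(reps, n))
means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

# Group the replications into 20 quantile bins of the sample mean and compare
# the average median with the average mean inside each bin.
edges = np.quantile(means, np.linspace(0, 1, 21))
idx = np.digitize(means, edges[1:-1])
for b in range(20):
    sel = idx == b
    print(f"bin {b:2d}:  E[mean] = {means[sel].mean():+.3f}   "
          f"E[median] = {medians[sel].mean():+.3f}")
```

The two columns track each other closely, which is exactly the statement $E(Y \mid \bar{X} = \bar{x}) = \bar{x}$.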


2
I believe this is an immediate consequence of the generalization I just posted at stats.stackexchange.com/a/83887. The distribution of the residuals $x_i - \bar{x}$ is obviously symmetric about $0$, whence their median has a symmetric distribution and therefore mean zero. Consequently, the expectation of the median itself (not just of the residuals) equals $0 + E(\bar{X} \mid \bar{X} = \bar{x}) = \bar{x}$, QED.
whuber

@whuber Sorry, residuals?
JohnK 2014

I defined them in my comment: they are the differences between each $x_i$ and the mean $\bar{x}$.
whuber

@whuber No, I understand that, but I am still trying to see how your other answer relates to my question, and exactly how the expectation you are using works.
JohnK

2
@whuber OK then, correct me if I am wrong: $E(Y \mid \bar{X}) = E(\bar{X} \mid \bar{X}) + E(Y - \bar{X} \mid \bar{X})$. The second term is now zero because the median is symmetric about $\bar{X}$, and so the expectation reduces to $\bar{x}$.
JohnK

Answers:


7

Let $X$ denote the original sample and $Z$ the random vector with entries $Z_k = X_k - \bar{X}$. Then $Z$ is centered normal (but its entries are not independent, as can be seen from the fact that their sum is zero almost surely). As a linear function of $X$, the vector $(Z, \bar{X})$ is normal, hence a simple computation of its covariance matrix suffices to show that $Z$ is independent of $\bar{X}$.

Turning to $Y$, one sees that $Y = \bar{X} + T$, where $T$ is the median of $Z$. In particular, $T$ depends on $Z$ only, hence $T$ is independent of $\bar{X}$; and the distribution of $Z$ is symmetric, hence $T$ is centered.

Finally,

$$E(Y \mid \bar{X}) = \bar{X} + E(T \mid \bar{X}) = \bar{X} + E(T) = \bar{X}.$$

Thank you, this was asked almost a year ago and I am glad someone has finally cleared it up.
JohnK

7

The sample median is an order statistic and has a non-normal distribution, so the joint finite-sample distribution of the sample median and the sample mean (which has a normal distribution) would not be bivariate normal. Resorting to approximations, asymptotically the following holds (see the answer here):

$$\sqrt{n}\left[\begin{pmatrix} \bar{X}_n \\ Y_n \end{pmatrix} - \begin{pmatrix} \mu \\ v \end{pmatrix}\right] \;\xrightarrow{L}\; N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix},\; \Sigma\right]$$

$$\Sigma = \begin{pmatrix} \sigma^2 & E\left(|X-v|\right)\left[2f(v)\right]^{-1} \\ E\left(|X-v|\right)\left[2f(v)\right]^{-1} & \left[2f(v)\right]^{-2} \end{pmatrix}$$

where $\bar{X}_n$ is the sample mean and $\mu$ the population mean, $Y_n$ is the sample median and $v$ the population median, $f(\cdot)$ is the probability density of the random variables involved, and $\sigma^2$ is the variance.

So approximately for large samples, their joint distribution is bivariate normal, so we have that

E(YnˉXn=ˉx)=v+ρσvσˉX(ˉxμ)

E(YnX¯n=x¯)=v+ρσvσX¯(x¯μ)

where $\rho$ is the correlation coefficient.

Manipulating the asymptotic distribution to become the approximate large-sample joint distribution of the sample mean and the sample median (and not of the standardized quantities), we have

$$\rho = \frac{\frac{1}{n}E\left(|X-v|\right)\left[2f(v)\right]^{-1}}{\frac{1}{n}\sigma\left[2f(v)\right]^{-1}} = \frac{E\left(|X-v|\right)}{\sigma}$$

So E(YnˉXn=ˉx)=v+E(|Xv|)σ[2f(v)]1σ(ˉxμ)

E(YnX¯n=x¯)=v+E(|Xv|)σ[2f(v)]1σ(x¯μ)

We have that $2f(v) = 2/(\sigma\sqrt{2\pi})$ due to the symmetry of the normal density, so we arrive at

E(YnˉXn=ˉx)=v+π2E(|Xμσ|)(ˉxμ)

E(YnX¯n=x¯)=v+π2E(Xμσ)(x¯μ)

where we have used $v = \mu$. Now the standardized variable is a standard normal, so its absolute value follows a half-normal distribution with expected value equal to $\sqrt{2/\pi}$ (since the underlying variance is unity). So

E(YnˉXn=ˉx)=v+π22π(ˉxμ)=v+ˉxμ=ˉx

E(YnX¯n=x¯)=v+π22π(x¯μ)=v+x¯μ=x¯

2
As always, nice answer +1. However, since we have no information about the sample size, the asymptotic distribution might not hold. If there is no way to obtain the exact distribution though, I suppose I'll have to make do. Thank you very much.
JohnK

6

The answer is $\bar{x}$.

Let x=(x1,x2,,xn)x=(x1,x2,,xn) have a multivariate distribution FF for which all the marginals are symmetric about a common value μμ. (It does not matter whether they are independent or even are identically distributed.) Define ˉxx¯ to be the arithmetic mean of the xi,xi, ˉx=(x1+x2++xn)/nx¯=(x1+x2++xn)/n and write xˉx=(x1ˉx,x2ˉx,,xnˉx)xx¯=(x1x¯,x2x¯,,xnx¯) for the vector of residuals. The symmetry assumption on FF implies the distribution of xˉxxx¯ is symmetric about 00; that is, when ERnERn is any event,

PrF(xˉxE)=PrF(xˉxE).

PrF(xx¯E)=PrF(xx¯E).

Applying the generalized result at /stats//a/83887 shows that the median of $x - \bar{x}$ has a symmetric distribution about $0$. Assuming its expectation exists (which is certainly the case when the marginal distributions of the $x_i$ are Normal), that expectation has to be $0$ (because the symmetry implies it equals its own negative).

Now since subtracting the same value $\bar{x}$ from each of a set of values does not change their order, $Y$ (the median of the $x_i$) equals $\bar{x}$ plus the median of $x - \bar{x}$. Consequently its expectation conditional on $\bar{x}$ equals the expectation of the median of $x - \bar{x}$ conditional on $\bar{x}$, plus $E(\bar{x} \mid \bar{x})$. The latter obviously is $\bar{x}$, whereas the former is $0$ because the unconditional expectation is $0$. Their sum is $\bar{x}$, QED.
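A quick simulation of the symmetry claim for normal samples (a sketch added here, not part of the answer; parameters and seed are arbitrary): the median of the residuals $x - \bar{x}$ has empirical quantiles that mirror those of its negative, and its mean is near $0$.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n, reps = 10.0, 4.0, 9, 300_000

x = rng.normal(mu, sigma, size=(reps, n))
t = np.median(x - x.mean(axis=1, keepdims=True), axis=1)  # median of residuals

qs = [0.05, 0.25, 0.5, 0.75, 0.95]
print("quantiles of  T:", np.round(np.quantile(t, qs), 3))
print("quantiles of -T:", np.round(np.quantile(-t, qs), 3))
print("mean of T      :", t.mean())   # ~ 0
```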


Thank you for posting it as a full answer. I now understand the essence of your argument but I might ping you if something is still unclear.
JohnK

5
JohnK, I need to alert you to be cautious. A counterexample to this argument has been brought to my attention. I have encouraged its originator to post it here for further discussion, but briefly it concerns a discrete bivariate distribution with symmetric marginals but asymmetric conditional marginals. Its existence points to a flawed deduction early in my argument. I currently hope that the argument might be rescued by imposing stronger conditions on the $x_i$, but my attention is presently focused elsewhere and I might not get to think about this for a while.
whuber

4
In the meantime I would encourage you to unaccept this answer. I would ordinarily delete any answer of mine known to be incorrect, but (as you might be able to tell) I like solutions based on first principles rather than detailed calculations, so I hope this argument can be rescued. I therefore intend to leave it open for criticism and improvement (and therefore made it CW); let the votes fall as they may.
whuber

Of course, thanks for letting me know. We will discuss it further when you have time. In the meantime I will settle for the asymptotic argument proposed by @Alecos Papadopoulos.
JohnK

6

This is simpler than the above answers make it. The sample mean is a complete and sufficient statistic (when the variance is known; but our result does not depend on the variance, hence it is also valid when the variance is unknown). Then the Rao-Blackwell and Lehmann-Scheffé theorems (see Wikipedia) together imply that the conditional expectation of the median, given the arithmetic mean, is the unique minimum variance unbiased estimator of the expectation $\mu$. But we know that this is the arithmetic mean, hence the result follows.

We also used the fact that the median is an unbiased estimator of $\mu$, which follows from symmetry.
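To make the chain of implications explicit, here is a sketch in symbols (my paraphrase of the argument, assuming $\sigma^2$ is known so that $\bar{X}$ is complete and sufficient for $\mu$):

```latex
\begin{align*}
&\text{(i) } \bar X \text{ is a complete and sufficient statistic for } \mu;\\
&\text{(ii) } E[Y] = \mu \text{ by symmetry, hence } E\!\left[\,E(Y \mid \bar X)\,\right] = \mu;\\
&\text{(iii) } E(Y \mid \bar X) \text{ is an unbiased estimator of } \mu
  \text{ that is a function of } \bar X
  \;\Rightarrow\; \text{it is the unique MVUE of } \mu
  \quad (\text{Lehmann--Scheff\'e});\\
&\text{(iv) } \bar X \text{ is also an unbiased function of } \bar X
  \;\Rightarrow\; \bar X \text{ is that same MVUE};\\
&\text{(v) } \therefore\; E(Y \mid \bar X) = \bar X \text{ almost surely.}
\end{align*}
```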


1
By symmetry $E[Y] = \mu$, indeed. Then from these two theorems we know that $E[Y \mid \bar{X}]$ is the unique minimum variance unbiased estimator for $\mu$, which we already know to be equal to $\bar{X}$. This is a brilliant answer, thank you very much. I would have marked it as the correct one, had I not done that already for another answer.
JohnK