Does a sample version of the one-sided Chebyshev inequality exist?


32

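I am interested in the following one-sided (Cantelli) version of the Chebyshev inequality:

$$P(X - E(X) \geq t) \leq \frac{\operatorname{Var}(X)}{\operatorname{Var}(X) + t^2}.$$

Basically, if you know the population mean and variance, you can calculate an upper bound on the probability of observing a value at least $t$ above the mean. (That is my understanding, at least.)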
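As a small illustration of how the population bound is evaluated, here is a minimal sketch. The function name `cantelli_bound` and the example numbers are hypothetical, chosen only for illustration; the formula is the one stated above and is meaningful for $t > 0$.

```python
def cantelli_bound(variance: float, t: float) -> float:
    """Upper bound on P(X - E[X] >= t) from the one-sided Chebyshev
    (Cantelli) inequality, given the population variance and t > 0."""
    if variance < 0 or t <= 0:
        raise ValueError("need variance >= 0 and t > 0")
    return variance / (variance + t * t)


# Hypothetical example: variance 4, threshold 3 above the mean.
print(cantelli_bound(4.0, 3.0))  # evaluates 4 / (4 + 9)
```

Note that the bound depends on the distribution only through its variance, which is what makes it distribution-free.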

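However, I would like to use the sample mean and sample variance instead of the actual population mean and variance.

I am guessing that, since this would introduce more uncertainty, the upper bound would increase.

Is there an inequality analogous to the one above, but one that uses the sample mean and variance?

Edit: A "sample" analog of the (two-sided) Chebyshev inequality has been worked out. The Wikipedia page has some details. However, I am not sure how it would translate to the one-sided case I have above.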


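Thanks, Glen_b. It is quite an interesting problem. I have always thought of Chebyshev's inequality as powerful (since it lets you do statistical inference without assuming a probability distribution), so being able to use it with the sample mean and variance would be quite awesome.
casandra, 2014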

Answers:


26

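Yes, we can get an analogous result using the sample mean and variance, with perhaps a couple of slight surprises emerging in the process.

First, we need to refine the question statement a little and set out a few assumptions. Importantly, it should be clear that we cannot hope to replace the population variance with the sample variance on the right-hand side, since the latter is random! So, we refocus our attention on the equivalent inequality

$$P(X - EX \geq t\sigma) \leq \frac{1}{1 + t^2}.$$

If it is not clear that these are equivalent, note that we have simply replaced $t$ with $t\sigma$ in the original inequality, without any loss of generality.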
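To spell out that substitution as a worked step, write $\sigma^2 = \operatorname{Var}(X)$ and assume $0 < \sigma < \infty$ (an assumption made here only so that the cancellation is valid):

```latex
\begin{align*}
P\bigl(X - EX \ge t\sigma\bigr)
  &\le \frac{\sigma^2}{\sigma^2 + (t\sigma)^2}
   = \frac{\sigma^2}{\sigma^2\,(1 + t^2)}
   = \frac{1}{1 + t^2}.
\end{align*}
```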

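Second, we assume that we have a random sample $X_1, \ldots, X_n$ and that we are interested in an upper bound for the analogous quantity $P(X_1 - \bar X \geq t S)$, where $\bar X$ is the sample mean and $S$ is the sample standard deviation.

A half-step forward

Note that already by applying the original one-sided Chebyshev inequality to $X_1 - \bar X$, we get

$$P(X_1 - \bar X \geq t\sigma) \leq \frac{1}{1 + \frac{n}{n-1} t^2},$$

where $\sigma^2 = \operatorname{Var}(X_1)$. This is smaller than the right-hand side of the original version. That makes sense! Any particular realization of a random variable from a sample will tend to be (slightly) closer to the sample mean to which it contributes than to the population mean. As we shall see below, we will get to replace $\sigma$ by $S$ under even more general assumptions.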
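The factor $n/(n-1)$ comes from the variance of $X_1 - \bar X$. A sketch of the computation, assuming the $X_i$ are independent and identically distributed with finite variance $\sigma^2$ (needed for this half-step only):

```latex
\begin{align*}
X_1 - \bar X &= \Bigl(1 - \tfrac{1}{n}\Bigr) X_1 - \tfrac{1}{n} \sum_{j=2}^{n} X_j, \\
\operatorname{Var}(X_1 - \bar X)
  &= \Bigl(1 - \tfrac{1}{n}\Bigr)^{2} \sigma^2 + \frac{n-1}{n^2}\,\sigma^2
   = \frac{(n-1)^2 + (n-1)}{n^2}\,\sigma^2
   = \frac{n-1}{n}\,\sigma^2, \\
P\bigl(X_1 - \bar X \ge t\sigma\bigr)
  &\le \frac{\frac{n-1}{n}\sigma^2}{\frac{n-1}{n}\sigma^2 + t^2\sigma^2}
   = \frac{1}{1 + \frac{n}{n-1}\,t^2}.
\end{align*}
```

The last line applies the one-sided Chebyshev inequality to $X_1 - \bar X$, which has mean zero.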

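A sample version of one-sided Chebyshev

Claim: Let $X_1, \ldots, X_n$ be a random sample such that $P(S = 0) = 0$. Then,

$$P(X_1 - \bar X \geq t S) \leq \frac{1}{1 + \frac{n}{n-1} t^2}.$$

In particular, the sample version of the bound is tighter than the original population version.

Note: We do not assume that the $X_i$ have either finite mean or variance!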
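To get a feel for how much tighter the sample bound is, here is a minimal sketch comparing the two right-hand sides. The function names `population_bound` and `sample_bound` are hypothetical, and $S$ is taken to be the bias-corrected sample standard deviation, as in the claim.

```python
def population_bound(t: float) -> float:
    """Right-hand side of P(X - EX >= t*sigma) <= 1 / (1 + t^2)."""
    return 1.0 / (1.0 + t * t)


def sample_bound(n: int, t: float) -> float:
    """Right-hand side of P(X_1 - Xbar >= t*S) <= 1 / (1 + n/(n-1) * t^2)."""
    if n < 2 or t <= 0:
        raise ValueError("need n >= 2 and t > 0")
    return 1.0 / (1.0 + n / (n - 1) * t * t)


# The gap is largest for small n and vanishes as n grows.
for n in (2, 5, 30, 1000):
    print(n, sample_bound(n, 1.0), population_bound(1.0))
```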

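Proof. The idea is to adapt the proof of the original one-sided Chebyshev inequality and employ symmetry in the process. First, set $Y_i = X_i - \bar X$ for notational convenience. Then, observe that

$$P(Y_1 \geq t S) = \frac{1}{n} \sum_{i=1}^n P(Y_i \geq t S) = E\, \frac{1}{n} \sum_{i=1}^n \mathbf{1}_{(Y_i \geq t S)}.$$

Now, for any $c > 0$, on $\{S > 0\}$,

$$\mathbf{1}_{(Y_i \geq t S)} = \mathbf{1}_{(Y_i + t c S \geq t S (1+c))} \leq \mathbf{1}_{((Y_i + t c S)^2 \geq t^2 (1+c)^2 S^2)} \leq \frac{(Y_i + t c S)^2}{t^2 (1+c)^2 S^2}.$$

Then,

$$\frac{1}{n} \sum_i \mathbf{1}_{(Y_i \geq t S)} \leq \frac{1}{n} \sum_i \frac{(Y_i + t c S)^2}{t^2 (1+c)^2 S^2} = \frac{(n-1) S^2 + n t^2 c^2 S^2}{n t^2 (1+c)^2 S^2} = \frac{(n-1) + n t^2 c^2}{n t^2 (1+c)^2},$$

since $\bar Y = 0$ and $\sum_i Y_i^2 = (n-1) S^2$.

The right-hand side is a constant (!), so taking expectations on both sides yields

$$P(X_1 - \bar X \geq t S) \leq \frac{(n-1) + n t^2 c^2}{n t^2 (1+c)^2}.$$

Finally, minimizing over $c$ yields $c = \frac{n-1}{n t^2}$, which after a little algebra establishes the result.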
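The "little algebra" can be sketched as follows. The shorthand $a = \frac{n-1}{n t^2}$ is introduced here only for convenience; dividing numerator and denominator of the bound by $n t^2$ puts it in the form $f(c)$ below.

```latex
\begin{align*}
f(c) &= \frac{(n-1) + n t^2 c^2}{n t^2 (1+c)^2} = \frac{a + c^2}{(1+c)^2},
  \qquad a = \frac{n-1}{n t^2}, \\
f'(c) &= \frac{2c(1+c) - 2(a + c^2)}{(1+c)^3} = \frac{2(c - a)}{(1+c)^3}, \\
f(a) &= \frac{a + a^2}{(1+a)^2} = \frac{a}{1+a}
      = \frac{1}{1 + \frac{1}{a}}
      = \frac{1}{1 + \frac{n}{n-1}\,t^2}.
\end{align*}
```

Since $f'(c) < 0$ for $0 < c < a$ and $f'(c) > 0$ for $c > a$, the stationary point $c = a$ is the minimizer over $c > 0$.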

That pesky technical condition

Note that we had to assume $P(S = 0) = 0$ in order to be able to divide by $S^2$ in the analysis. This is no problem for absolutely continuous distributions, but poses an inconvenience for discrete ones. For a discrete distribution, there is some probability that all observations are equal, in which case $0 = Y_i = tS = 0$ for all $i$ and $t > 0$.

We can wiggle our way out by setting $q = P(S = 0)$. Then, a careful accounting of the argument shows that everything goes through virtually unchanged and we get

Corollary 1. For the case $q = P(S = 0) > 0$, we have

$$P(X_1 - \bar X \geq t S) \leq (1 - q) \frac{1}{1 + \frac{n}{n-1} t^2} + q.$$

Proof. Split on the events $\{S > 0\}$ and $\{S = 0\}$. The previous proof goes through for $\{S > 0\}$ and the case $\{S = 0\}$ is trivial.

A slightly cleaner inequality results if we replace the nonstrict inequality in the probability statement with a strict version.

Corollary 2. Let $q = P(S = 0)$ (possibly zero). Then,

$$P(X_1 - \bar X > t S) \leq (1 - q) \frac{1}{1 + \frac{n}{n-1} t^2}.$$
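For a concrete discrete case, here is a minimal sketch that checks both corollaries exactly for a Bernoulli($p$) sample by enumerating all $2^n$ outcomes. The Bernoulli example and the function names are assumptions made for illustration; for this distribution $q = P(S = 0) = p^n + (1-p)^n$.

```python
import math
from itertools import product


def corollary_bounds(n: int, t: float, q: float) -> tuple[float, float]:
    """Return (Corollary 1 bound, Corollary 2 bound) for given q = P(S = 0)."""
    base = 1.0 / (1.0 + n / (n - 1) * t * t)
    return (1.0 - q) * base + q, (1.0 - q) * base


def exact_bernoulli_tail(n: int, t: float, p: float, strict: bool = False) -> float:
    """Exact P(X_1 - Xbar >= t*S), or with > if strict, for i.i.d. Bernoulli(p)."""
    total = 0.0
    for xs in product((0, 1), repeat=n):
        k = sum(xs)
        prob = p**k * (1.0 - p) ** (n - k)
        xbar = k / n
        # Bias-corrected sample standard deviation (divide by n - 1).
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        y1 = xs[0] - xbar
        hit = (y1 > t * s) if strict else (y1 >= t * s)
        if hit:
            total += prob
    return total


n, t, p = 3, 1.0, 0.5
q = p**n + (1.0 - p) ** n  # all observations equal
bound_nonstrict, bound_strict = corollary_bounds(n, t, q)
print(exact_bernoulli_tail(n, t, p), bound_nonstrict)
print(exact_bernoulli_tail(n, t, p, strict=True), bound_strict)
```

The nonstrict event picks up the all-equal outcomes (where $0 \geq 0$ holds), which is exactly the extra $q$ in Corollary 1.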

Final remark: The sample version of the inequality required no assumptions on X (other than that it not be almost-surely constant in the nonstrict inequality case, which the original version also tacitly assumes), in essence, because the sample mean and sample variance always exist whether or not their population analogs do.
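To illustrate the remark, here is a minimal simulation sketch using a heavy-tailed distribution with no finite mean (Pareto with shape 0.5, an arbitrary choice for illustration). It averages the indicator over all $i$ rather than looking at $X_1$ alone; by exchangeability this has the same expectation, which is the symmetry step used in the proof. The function names, sample size, and seed are hypothetical.

```python
import math
import random


def sample_bound(n: int, t: float) -> float:
    """Right-hand side 1 / (1 + n/(n-1) * t^2) of the sample inequality."""
    return 1.0 / (1.0 + n / (n - 1) * t * t)


def estimate_tail(n: int, t: float, trials: int = 5000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(X_1 - Xbar >= t*S) for Pareto(0.5) samples."""
    rng = random.Random(seed)
    hits = 0
    used = 0
    for _ in range(trials):
        xs = [rng.paretovariate(0.5) for _ in range(n)]  # infinite mean
        xbar = sum(xs) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        if s == 0.0:
            continue  # excluded by the assumption P(S = 0) = 0
        hits += sum(1 for x in xs if x - xbar >= t * s)
        used += 1
    return hits / (used * n)


print(estimate_tail(5, 1.0), sample_bound(5, 1.0))
```

Because the bound in the proof holds sample by sample before expectations are taken, the averaged estimate cannot exceed the bound in any run, whatever the underlying distribution.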


15

This is just a complement to @cardinal's ingenious answer. The Samuelson inequality states that, for a sample of size $n$, when we have at least three distinct values of the realized $x_i$'s, it holds that

$$x_i - \bar x < s \sqrt{n-1}, \qquad i = 1, \ldots, n,$$

where $s$ is calculated without the bias correction, $s = \left( \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2 \right)^{1/2}$.

Then, using the notation of Cardinal's answer, we can state that

$$P\left(X_1 - \bar X \geq S \sqrt{n-1}\right) = 0 \quad \text{a.s.} \qquad [1]$$

Since we require three distinct values, we will have $S \neq 0$ by assumption. So setting $t = \sqrt{n-1}$ in Cardinal's Inequality (the initial version) we obtain

$$P\left(X_1 - \bar X \geq S \sqrt{n-1}\right) \leq \frac{1}{1+n}. \qquad [2]$$

Eq. [2] is of course compatible with eq. [1]. The combination of the two tells us that Cardinal's Inequality is useful as a probabilistic statement for $0 < t < \sqrt{n-1}$.

If Cardinal's Inequality requires $S$ to be calculated bias-corrected (call this $\tilde S$), then the equations become

$$P\left(X_1 - \bar X \geq \tilde S\, \frac{n-1}{\sqrt{n}}\right) = 0 \quad \text{a.s.} \qquad [1a]$$

and we choose $t = \frac{n-1}{\sqrt{n}}$ to obtain through Cardinal's Inequality

$$P\left(X_1 - \bar X \geq \tilde S\, \frac{n-1}{\sqrt{n}}\right) \leq \frac{1}{n}, \qquad [2a]$$

and the probabilistically meaningful interval for $t$ is $0 < t < \frac{n-1}{\sqrt{n}}$.
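A minimal numerical sketch of these statements follows. The data values and function names are hypothetical, chosen for illustration; `samuelson_limit` uses the uncorrected $s$, and `largest_informative_t` returns the endpoint of the interval for $t$ in either convention.

```python
import math


def samuelson_limit(xs: list[float]) -> float:
    """s * sqrt(n - 1), with s computed without the bias correction."""
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / n)
    return s * math.sqrt(n - 1)


def largest_informative_t(n: int, bias_corrected: bool = True) -> float:
    """Endpoint of the range of t for which the event is not already impossible."""
    return (n - 1) / math.sqrt(n) if bias_corrected else math.sqrt(n - 1)


xs = [1.0, 2.0, 3.0, 10.0]  # four distinct values
n = len(xs)
xbar = sum(xs) / n
max_dev = max(x - xbar for x in xs)
print(max_dev, samuelson_limit(xs))  # largest deviation vs. Samuelson's limit

# Bias-corrected case: the bound at the endpoint equals 1/n, as in [2a].
t_max = largest_informative_t(n)
bound_at_t_max = 1.0 / (1.0 + n / (n - 1) * t_max**2)
print(bound_at_t_max, 1.0 / n)
```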

2
(+1) Incidentally, as I was first considering this problem, the fact that $\max_i |X_i - \bar X| \leq S \sqrt{n-1}$ was actually the initial clue that the sample inequality should be tighter than the original. I wanted to squeeze that into my post, but couldn't find a (comfortable) place for it. I'm glad to see you mention it (actually a very slight improvement on it) here along with your very nice additional elaboration. Cheers.
cardinal

Cheers @Cardinal, great answer -just clarify for me -does it matter for your Inequality how one defines the sample variance (bias-corrected or not)?
Alecos Papadopoulos

Only ever so slightly. I used the bias-corrected sample variance. If you use $n$ instead of $n-1$ to normalize, then you'll end up with
$$\frac{1 + t^2 c^2}{t^2 (1+c)^2}$$
instead of
$$\frac{(n-1) + n t^2 c^2}{n t^2 (1+c)^2},$$
which means the $n/(n-1)$ term in the final inequality will disappear. Thus, you'll get the same bound as in the original one-sided Chebyshev inequality in that case. (Assuming I've done the algebra correctly.) :-)
cardinal

@Cardinal ...which means that the relevant equations in my answer are 1a and 2a, which means that your inequality tells us that for t chosen to activate Samuelson Inequality, the probability of the event we are examining, cannot be greater than 1/n, i.e. not greater than randomly choosing any one realized value from the sample... which somehow makes some hazy intuitive sense: what is proven certainly impossible in deterministic terms, when approached probabilistically its probability bound does not exceed equiprobability... not clear in my mind yet.
Alecos Papadopoulos
Licensed under cc by-sa 3.0 with attribution required.