When the Central Limit Theorem and the Law of Large Numbers Disagree


19

Essentially, this is a duplicate of a question I found on math.SE, which did not receive the answer I was hoping for.

Let $\{X_i\}_{i\in\mathbb{N}}$ be a sequence of independent and identically distributed random variables with $E[X_i]=1$ and $V[X_i]=1$.

Consider the evaluation of

$$\lim_{n\to\infty}P\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i\le\sqrt{n}\right)$$

Since both sides of the inequality inside the probability tend to infinity, the expression has to be manipulated before the limit can be considered.

A) Attempt to subtract

Before considering the limit statement, subtract $\sqrt{n}$ from both sides:

$$\lim_{n\to\infty}P\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i-\sqrt{n}\le\sqrt{n}-\sqrt{n}\right)=\lim_{n\to\infty}P\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-1)\le 0\right)=\Phi(0)=\frac{1}{2}$$

The last equality holds by the CLT, where $\Phi(\cdot)$ is the standard normal distribution function.

B) Attempt to multiply

Multiply both sides by $1/\sqrt{n}$:

$$\lim_{n\to\infty}P\left(\frac{1}{\sqrt{n}}\cdot\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i\le\frac{1}{\sqrt{n}}\cdot\sqrt{n}\right)=\lim_{n\to\infty}P\left(\frac{1}{n}\sum_{i=1}^{n}X_i\le 1\right)$$

$$=\lim_{n\to\infty}P\left(\bar X_n\le 1\right)=\lim_{n\to\infty}F_{\bar X_n}(1)=1$$

where $F_{\bar X_n}(\cdot)$ is the distribution function of the sample mean $\bar X_n$, which by the LLN converges in probability (and so also in distribution) to the constant $1$; hence the last equality.

So we obtain contradictory results. Which one is correct, and why is the other one wrong?
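As a sanity check, the probability under the limit can be estimated by simulation. A minimal sketch (the function name and the choice $X_i\sim\text{Exp}(1)$, which satisfies $E[X_i]=V[X_i]=1$, are mine, chosen for illustration):

```python
import random

def estimate_prob(n, reps=5000, seed=0):
    """Monte Carlo estimate of P((1/sqrt(n)) * sum(X_i) <= sqrt(n)).
    The event is equivalent to sum(X_i) <= n.
    X_i ~ Exp(1) is an illustrative choice with E[X_i] = V[X_i] = 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        s = sum(rng.expovariate(1.0) for _ in range(n))
        if s <= n:
            hits += 1
    return hits / reps

for n in (10, 100, 400):
    print(n, estimate_prob(n))  # hovers near 1/2, not near 1
```

The estimates hover near $1/2$ for every large $n$, which already hints at which of the two attempts is correct.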


1
@JuhoKokkala Certainly, here it is: math.stackexchange.com/q/2830304/87400 Just ignore the OP's mistake there.
Alecos Papadopoulos

2
I think the problem lies in the second statement, the one invoking the LLN.
Glen_b -Reinstate Monica, 2018

3
I followed you until the final equality. It is clearly wrong, because we expect $P(\bar X_n\le 1)$ to be approximately $1/2$ for large $n$, and therefore its limit should not equal $1$. What is its intended justification? It isn't the statement of any version of the law of large numbers I know of.
whuber

1
@whuber It assumes that all the probability mass of the sample mean concentrates at the value $1$. If this is wrong, I believe it is important that the error be detailed in an answer, which is the purpose of this question.
Alecos Papadopoulos

2
Alecos, my concern is not whether the last step is wrong: it concerns your reasons for taking it. Isn't that what the question is about, after all? Absent those reasons, I still have received none from you, and I would not even venture to guess what they might be. Although you refer to "the LLN", I believe the resolution of your problem may lie in describing precisely what you understand the claim of "the LLN" to be.
whuber

Answers:


15

The error here likely lies in the following fact: convergence in distribution implicitly assumes that $F_n(x)$ converges to $F(x)$ at the continuity points of $F(x)$. Since the limiting distribution is that of a constant random variable, it has a jump discontinuity at $x=1$, and hence it is incorrect to conclude that the CDF converges to $F(1)=1$.


1
The way we define convergence in distribution does not exclude the possibility of convergence at points of discontinuity: it just does not require it.
Alecos Papadopoulos

1
But if convergence in distribution does not require $F_n(1)$ to converge to $F(1)$, what is the last equality in the question based on?
Juho Kokkala

1
@Juho It isn't based on anything; that is the whole point. There is no theorem that would make the last equation in the question hold.
whuber

1
@AlecosPapadopoulos: I never said that it excludes the possibility. My point is that you need to justify the final equality beyond what convergence in distribution gives you. For example, it is true if the $X_n$ are Bernoulli.
Alex R.

11

For iid random variables $X_i$ with $E[X_i]=\operatorname{var}(X_i)=1$, define

$$Z_n=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-1),\qquad Y_n=\frac{1}{n}\sum_{i=1}^{n}X_i.$$

Now, the CLT says that for every fixed real number $z$, $\lim_{n\to\infty}F_{Z_n}(z)=\Phi(z)$. The OP applies the CLT to evaluate

$$\lim_{n\to\infty}P\left(Z_n\le 0\right)=\Phi(0)=\frac{1}{2}.$$

As the other answers, as well as several comments on the OP's question, have pointed out, it is the OP's evaluation of $\lim_{n\to\infty}P(Y_n\le 1)$ that is suspect. Consider the special case in which the iid $X_i$ are discrete random variables taking on the values $0$ and $2$ with probability $\frac{1}{2}$ each. Now, $\sum_{i=1}^{n}X_i$ can take on all even integer values in $[0,2n]$, and so when $n$ is odd, $\sum_{i=1}^{n}X_i$ cannot take on the value $n$, and thus $Y_n=\frac{1}{n}\sum_{i=1}^{n}X_i$ cannot take on the value $1$. Furthermore, since the distribution of $Y_n$ is symmetric about $1$, $P(Y_n\le 1)=F_{Y_n}(1)$ has value exactly $\frac{1}{2}$ whenever $n$ is odd. Thus, the sequence of numbers

$$P(Y_1\le 1),P(Y_2\le 1),\ldots,P(Y_n\le 1),\ldots$$

contains the subsequence

$$P(Y_1\le 1),P(Y_3\le 1),\ldots,P(Y_{2k-1}\le 1),\ldots$$

in which every term has value $\frac{1}{2}$. On the other hand, the subsequence

$$P(Y_2\le 1),P(Y_4\le 1),\ldots,P(Y_{2k}\le 1),\ldots$$

has terms $P(Y_{2k}\le 1)=\frac{1}{2}+\frac{1}{2}P(Y_{2k}=1)$, which exceed $\frac{1}{2}$ but converge to $\frac{1}{2}$, since $P(Y_{2k}=1)=\binom{2k}{k}2^{-2k}\to 0$. Hence $\lim_{n\to\infty}P(Y_n\le 1)=\frac{1}{2}$, and the claim that $P(Y_n\le 1)$ converges to $1$ must be viewed with considerable suspicion.
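The odd/even subsequence behaviour in this $0/2$-valued example can be checked numerically. A minimal sketch (the helper name `p_yn_le_1` is mine) that computes $P(Y_n\le 1)$ exactly, using $\sum_{i=1}^n X_i = 2B$ with $B\sim\text{Binomial}(n,1/2)$:

```python
from math import comb

def p_yn_le_1(n):
    """Exact P(Y_n <= 1) when each X_i is 0 or 2 with probability 1/2.
    Then sum(X_i) = 2*B with B ~ Binomial(n, 1/2), and Y_n <= 1 iff B <= n/2."""
    return sum(comb(n, k) for k in range(n // 2 + 1)) / 2**n

for n in (5, 6, 101, 102):
    print(n, p_yn_le_1(n))  # exactly 0.5 for odd n; a bit above 0.5 for even n
```

For odd $n$ the value is exactly $1/2$; for even $n$ it equals $\frac{1}{2}+\frac{1}{2}\binom{n}{n/2}2^{-n}$, which shrinks toward $1/2$, never toward $1$.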

8

Your first result is the correct one. Your error occurs in the second part, in the following erroneous statement:

$$\lim_{n\to\infty}F_{\bar X_n}(1)=1.$$

This statement is false (the right-hand side should be $\frac{1}{2}$), and it does not follow from the law of large numbers as claimed. The law of large numbers (which you cite) says that

$$\lim_{n\to\infty}P\left(|\bar X_n-1|\le\varepsilon\right)=1\quad\text{for all }\varepsilon>0.$$

For all $\varepsilon>0$, the condition $|\bar X_n-1|\le\varepsilon$ spans some values where $\bar X_n\le 1$ and some values where $\bar X_n>1$. Hence, it does not follow from the LLN that $\lim_{n\to\infty}P(\bar X_n\le 1)=1$.
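A simulation sketch contrasting the event that the LLN controls with the event in the disputed claim (the function name and the illustrative choice $X_i\sim\text{Exp}(1)$, which has mean and variance $1$, are mine):

```python
import random

def lln_vs_claim(n, eps=0.1, reps=4000, seed=1):
    """Estimate P(|mean - 1| <= eps) (what the LLN controls) and
    P(mean <= 1) (what the disputed claim is about), for X_i ~ Exp(1),
    an illustrative distribution with mean = variance = 1."""
    rng = random.Random(seed)
    near = below = 0
    for _ in range(reps):
        m = sum(rng.expovariate(1.0) for _ in range(n)) / n
        near += abs(m - 1) <= eps
        below += m <= 1
    return near / reps, below / reps

near, below = lln_vs_claim(400)
print(near, below)  # `near` is close to 1 (LLN); `below` stays near 1/2
```

The first probability approaches $1$ exactly as the LLN promises, while the second stays near $1/2$: the LLN statement simply does not constrain how the mass inside $(1-\varepsilon,1+\varepsilon)$ splits around $1$.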


1
The (erroneous indeed) result comes from the implication "convergence in probability implies convergence in distribution". The question does not state that the assertion comes directly from the LLN.
Alecos Papadopoulos

@AlecosPapadopoulos: Convergence in probability does imply convergence in distribution. Again, convergence in distribution is required only at points of continuity. But maybe you meant that convergence in probability does not imply pointwise convergence of the distribution functions.
Alex R.

@AlexR. I am not sure where your objection lies. I believe this issue is covered in my own answer.
Alecos Papadopoulos

3

Convergence in probability implies convergence in distribution. But... what distribution? If the limiting distribution has a jump discontinuity then the limits become ambiguous (because multiple values are possible at the discontinuity).

where $F_{\bar X_n}(\cdot)$ is the distribution function of the sample mean $\bar X_n$, which by the LLN converges in probability (and so also in distribution) to the constant $1$,

This is not right, and it is also easy to show that it cannot be right (differently from the disagreement between the CLT and the LLN). The limiting distribution, which can be seen as the limit of the CDFs of a sequence of normally distributed variables, should be:

$$F_{\bar X}(x)=\begin{cases}0 & \text{for }x<1\\ 0.5 & \text{for }x=1\\ 1 & \text{for }x>1\end{cases}$$

for this function you have that, for any $\epsilon>0$ and every $x$, the difference $|F_{\bar X_n}(x)-F_{\bar X}(x)|<\epsilon$ for sufficiently large $n$. This would fail if $F_{\bar X}(1)=1$ instead of $F_{\bar X}(1)=0.5$.
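A quick numerical check of this pointwise convergence, assuming for concreteness that the $X_i$ are $N(1,1)$ so that $\bar X_n\sim N(1,1/n)$ exactly (the helper name is mine):

```python
from math import erf, sqrt

def F_mean(x, n):
    """Exact CDF of the sample mean of n iid N(1,1) variables,
    i.e. of N(1, 1/n); normality is assumed here for illustration."""
    return 0.5 * (1 + erf((x - 1) / sqrt(2.0 / n)))

for n in (4, 100, 10000):
    print(n, F_mean(0.9, n), F_mean(1.0, n), F_mean(1.1, n))
# F_mean(0.9, n) -> 0 and F_mean(1.1, n) -> 1, but F_mean(1.0, n) = 0.5 for every n
```

The value at $x=1$ never moves from $0.5$, matching the middle case of the limiting function above.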


Limit of a normal distribution

It may be helpful to explicitly write out the sum used to invoke the law of large numbers.

$$\bar X_n=\frac{1}{n}\sum_{i=1}^{n}X_i\sim N\left(1,\frac{1}{n}\right)$$

(taking, for concreteness, normally distributed $X_i$; otherwise this holds approximately, by the CLT)

The limit $n\to\infty$ of the distribution of $\bar X_n$ is in fact equivalent to the Dirac delta function, represented as the limit of normal distributions with variance going to zero.

Using that expression it is easier to see what is going on under the hood, rather than using the ready-made laws of the CLT and LLN, which obscure the reasoning behind them.


Convergence in probability

The law of large numbers gives you 'convergence in probability'

$$\lim_{n\to\infty}P\left(|\bar X_n-1|>\epsilon\right)=0$$

with $\epsilon>0$

An equivalent statement could be made for the central limit theorem: $\lim_{n\to\infty}P\left(\left|\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-1)\right|>\epsilon\sqrt{n}\right)=0$

It is wrong to state that this implies

$$\lim_{n\to\infty}P\left(|\bar X_n-1|>0\right)=0$$

It is less nice that this question was cross-posted so early (confusing, yet interesting to see the different discussions/approaches, math vs. stats, so not all that bad). The answer by Michael Hardy on the math stackexchange deals with it very effectively in terms of the strong law of large numbers (the same principle as the accepted answer from drhab in the cross-posted question, and Dilip's answer here). We are almost sure that the sequence $\bar X_1,\bar X_2,\bar X_3,\ldots,\bar X_n$ converges to $1$, but this does not mean that $\lim_{n\to\infty}P(\bar X_n=1)$ will be equal to $1$, as Dilip's answer also illustrates. The dice example in the comments by Tomasz shows this very nicely from a different angle (instead of the limit not existing, the limit goes to zero). The mean of a sequence of dice rolls will converge to the mean of the die, but the probability of being exactly equal to it goes to zero.
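The dice example can be made exact with a short computation; a sketch (the helper name is mine) that convolves the distribution of the sum of $n$ fair dice:

```python
def p_mean_exactly_35(n):
    """Exact P(mean of n fair dice == 3.5), by convolving the
    distribution of the sum of n dice."""
    dist = {s: 1.0 / 6 for s in range(1, 7)}  # distribution of one die
    for _ in range(n - 1):
        new = {}
        for s, p in dist.items():
            for face in range(1, 7):
                new[s + face] = new.get(s + face, 0.0) + p / 6
        dist = new
    if (7 * n) % 2:  # mean 3.5 requires the sum 3.5*n to be an integer
        return 0.0
    return dist.get(7 * n // 2, 0.0)

for n in (2, 10, 50):
    print(n, p_mean_exactly_35(n))  # decreases toward 0 as n grows
```

Even though the sample mean concentrates around $3.5$, the probability of hitting it exactly shrinks roughly like $1/\sqrt{n}$ (and is zero outright for odd $n$).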


Heaviside step function and Dirac delta function

The CDF of X¯n is the following:

$$F_{\bar X_n}(x)=\frac{1}{2}\left(1+\operatorname{erf}\left(\frac{x-1}{\sqrt{2/n}}\right)\right)$$

with, if you like, $\lim_{n\to\infty}F_{\bar X_n}(1)=0.5$ (related to the Heaviside step function, the integral of the Dirac delta function when viewed as the limit of normal distributions).


I believe that this view intuitively resolves your question regarding "showing that it is wrong", or at least it shows that understanding the cause of this disagreement between the CLT and the LLN is equivalent to understanding the integral of the Dirac delta function, or a sequence of normal distributions with variance decreasing to zero.


2
Your limiting distribution is in fact not a distribution at all. A CDF must be right-continuous, whereas it clearly is not at $x=1$, where it takes the value $1/2$.
Alex R.

The right continuity seems to be necessary so that for every $a$ we have $\lim_{n\to\infty}F_X(a+\frac{1}{n})=F_X(a)$: since the events $X\le a+\frac{1}{n}$ are nested, we should have
$$\lim_{n\to\infty}F_X\left(a+\tfrac{1}{n}\right)=\lim_{n\to\infty}P\left(X\le a+\tfrac{1}{n}\right)=P\left(\bigcap_n\left\{X\le a+\tfrac{1}{n}\right\}\right)=P\left(X\le a\right)=F_X(a)$$
but is this true in our case, and where is the catch? Is this right continuity necessary based on the probability axioms, or is it just a convention so that the CDF works for the most common cases?
Sextus Empiricus

@MartijnWeterings: This is precisely where it comes from. Any valid measure $P$ must satisfy these monotonicity results. They are a consequence of the boundedness of $P$ together with countable additivity. More generally, a function $F(x)$ is a CDF (i.e., corresponds to some distribution $P$ via $F(b)-F(a)=P(a<X\le b)$) iff $F$ is right-continuous, monotonic, and has left limit $0$ and right limit $1$.
Alex R.

2

I believe it should be clear by now that "the CLT approach" gives the right answer.

Let's pinpoint exactly where the "LLN approach" goes wrong.

Starting with the finite-$n$ statements, it is clear that we can equivalently either subtract $\sqrt{n}$ from both sides or multiply both sides by $1/\sqrt{n}$. We get

$$P\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i\le\sqrt{n}\right)=P\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-1)\le 0\right)=P\left(\frac{1}{n}\sum_{i=1}^{n}X_i\le 1\right)$$

So if the limit exists, it will be identical for both. Setting $Z_n=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-1)$, we have, using distribution functions,

$$P\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i\le\sqrt{n}\right)=F_{Z_n}(0)=F_{\bar X_n}(1)$$

...and it is true that $\lim_{n\to\infty}F_{Z_n}(0)=\Phi(0)=1/2$.

The thinking in the "LLN approach" goes as follows: "We know from the LLN that $\bar X_n$ converges in probability to a constant. And we also know that 'convergence in probability implies convergence in distribution'. So, $\bar X_n$ converges in distribution to a constant." Up to here we are correct.
Then we state: "therefore, the limiting probabilities for $\bar X_n$ are given by the distribution function of the constant-at-$1$ random variable",

$$F_1(x)=\begin{cases}1 & x\ge 1\\ 0 & x<1\end{cases}\qquad\text{so}\qquad F_1(1)=1$$

...so $\lim_{n\to\infty}F_{\bar X_n}(1)=F_1(1)=1$...

...and we have just made our mistake. Why? Because, as @AlexR.'s answer noted, "convergence in distribution" covers only the points of continuity of the limiting distribution function, and $1$ is a point of discontinuity of $F_1$. This means that $\lim_{n\to\infty}F_{\bar X_n}(1)$ may equal $F_1(1)$, but it need not, without negating the "convergence in distribution to a constant" implication of the LLN.

From the CLT approach, we know what the value of the limit must be: $1/2$. I do not know of a way to prove directly that $\lim_{n\to\infty}F_{\bar X_n}(1)=1/2$.

Did we learn anything new?

I did. The LLN asserts that

$$\lim_{n\to\infty}P\left(|\bar X_n-1|\le\varepsilon\right)=1\quad\text{for all }\varepsilon>0,$$

which implies

$$\lim_{n\to\infty}\Big[P\left(1-\varepsilon<\bar X_n\le 1\right)+P\left(1<\bar X_n\le 1+\varepsilon\right)\Big]=1$$

and, since $\lim_{n\to\infty}P(\bar X_n\le 1-\varepsilon)=0$, equivalently

$$\lim_{n\to\infty}\Big[P\left(\bar X_n\le 1\right)+P\left(1<\bar X_n\le 1+\varepsilon\right)\Big]=1$$

The LLN does not say how the probability is allocated within the $(1-\varepsilon,1+\varepsilon)$ interval. What I learned is that, in this class of convergence results, the probability is in the limit allocated equally on the two sides of the centerpoint of the collapsing interval.

The general statement here is, assume

$$X_n\xrightarrow{\,p\,}\theta,\qquad h(n)\,(X_n-\theta)\xrightarrow{\,d\,}D(0,V)$$

where $D$ is some random variable with distribution function $F_D$. Then

$$\lim_{n\to\infty}P\left[X_n\le\theta\right]=\lim_{n\to\infty}P\left[h(n)(X_n-\theta)\le 0\right]=F_D(0)$$

...which may not be equal to $1$, the value at $\theta$ of the distribution function of the constant random variable $\theta$.

Also, this is a strong example that, when the distribution function of the limiting random variable has discontinuities, then "convergence in distribution to a random variable" may describe a situation where "the limiting distribution" may disagree with the "distribution of the limiting random variable" at the discontinuity points. Strictly speaking, the limiting distribution for the continuity points is that of the constant random variable. For the discontinuity points we may be able to calculate the limiting probability, as "separate" entities.
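A sketch illustrating the general statement with two illustrative choices of $X_n$, neither taken from the text above: the sample mean of $\text{Exp}(1)$ draws, where $D$ is symmetric so $F_D(0)=1/2$, and the maximum of $n$ uniforms with $\theta=1$ and $h(n)=n$, where $D$ is one-sided so $F_D(0)=1$ (function and variable names are mine):

```python
import random

rng = random.Random(2)

def frac_at_most_theta(draw, n, reps=4000):
    """Fraction of simulated X_n values with X_n <= theta (theta = 1 in both examples)."""
    return sum(draw(n) <= 1 for _ in range(reps)) / reps

# X_n = mean of n Exp(1) draws: sqrt(n)(X_n - 1) ->d N(0,1), so F_D(0) = 1/2.
mean_exp = lambda n: sum(rng.expovariate(1.0) for _ in range(n)) / n

# X_n = max of n U(0,1) draws: n(X_n - 1) ->d -Exp(1), so F_D(0) = 1.
max_unif = lambda n: max(rng.random() for _ in range(n))

print(frac_at_most_theta(mean_exp, 200))  # near 1/2
print(frac_at_most_theta(max_unif, 200))  # exactly 1.0
```

The second example shows that the limiting allocation at the discontinuity point is governed by $F_D(0)$, not automatically by a 50/50 split; the equal split is a feature of symmetric limits such as the CLT's.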


The 'lesson learned' perspective is interesting, and this is a good, not too difficult, example for didactic application. Although I wonder what kind of (direct) practical application this thinking about the infinite has, because in practice $n$ is always finite.
Sextus Empiricus

@MartijnWeterings Martijn, the motivation here was certainly educational, a) as an alert to discontinuities even in such a "flat" situation as convergence to a constant, and so also in general (they destroy uniform convergence, for example), and b) because a result on how the probability mass is allocated becomes interesting when a sequence that converges in probability to a constant still has non-zero variance.
Alecos Papadopoulos

We could say that the CLT lets us say something about convergence to a limiting normally distributed variable (thus being able to express such things as $F(x)$), but the LLN only allows us to say that, by increasing the sample size, we get closer to the true mean. The LLN means that the sample mean gets closer and closer to a limiting value, but not that it becomes, with higher probability, exactly equal to it. The LLN says nothing about $F(x)$.
Sextus Empiricus

The original thoughts around the LLN were actually the opposite (see the reasoning of Arbuthnot, stats.stackexchange.com/questions/343268): "It is visible from what has been said, that with a very great Number of Dice, A's Lot would become very small... there would be but a small part of all the possible Chances, for its happening at any assignable time, that an equal Number of Males and Females should be born."
Sextus Empiricus
Licensed under cc by-sa 3.0 with attribution required.