Uniform random variable as a sum of two random variables


18

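Taken from Grimmett and Stirzaker:

Show that it cannot be the case that $U = X + Y$, where $U$ is uniformly distributed on $[0,1]$ and $X$ and $Y$ are independent and identically distributed. You should not assume that $X$ and $Y$ are continuous variables.

A simple proof by contradiction suffices for the case where $X$ and $Y$ are assumed discrete, by arguing that it is always possible to find $u$ and $u'$ such that $P(U \le u + u') > P(U \le u)$ while $P(X + Y \le u) = P(X + Y \le u + u')$.

However, this proof does not extend to $X, Y$ being absolutely continuous or singular continuous. Hints/comments/critique?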


3
Hint: Characteristic functions are your friends.
cardinal

1
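X and Y are iid, so their characteristic functions must be identical. You do need to use the characteristic function rather than the moment generating function, though: the mgf is not guaranteed to exist for X, so showing that the mgf has an impossible property does not mean there is no such X. All RVs have a characteristic function, so if you show that it has an impossible property, then there is no such X.
Silverfish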

1
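If the distributions of $X$ and $Y$ have any atoms, say $P\{X = a\} = P\{Y = a\} = b > 0$, then $P\{X + Y = 2a\} \ge b^2 > 0$, and so $X + Y$ cannot be uniformly distributed on $[0,1]$. Thus, it is not necessary to consider the case where the distributions of $X$ and $Y$ have atoms.
Dilip Sarwate, 2014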
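The atom argument in the comment above can be checked on a small discrete example. This is only an illustrative sketch: the helper name `pmf_of_sum` and the two-point distribution are assumptions made for the example, not part of the original thread.

```python
from collections import defaultdict
from fractions import Fraction


def pmf_of_sum(pmf):
    """Exact pmf of X + Y, where Y is an independent copy of X."""
    out = defaultdict(Fraction)  # Fraction() is 0
    for x, px in pmf.items():
        for y, py in pmf.items():
            out[x + y] += px * py
    return dict(out)


# Hypothetical example: X puts mass 1/2 on each of 0 and 1/2.
pmf_x = {Fraction(0): Fraction(1, 2), Fraction(1, 2): Fraction(1, 2)}
pmf_s = pmf_of_sum(pmf_x)

# Every atom a of X (with mass b) forces an atom of mass >= b^2 at 2a in X + Y.
atoms_propagate = all(pmf_s[2 * a] >= b ** 2 for a, b in pmf_x.items())
print(pmf_s, atoms_propagate)
```

Since a uniform distribution assigns probability zero to every single point, any atom in $X$ already rules out uniformity of the sum.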

Answers:


13

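The result can be proven with a picture: the visible gray areas show that a uniform distribution cannot be decomposed as a sum of two independent, identically distributed variables.

Notation

Let $X$ and $Y$ be iid such that $X + Y$ has a uniform distribution on $[0,1]$. This means that for all $0 \le a \le b \le 1$,

$$\Pr(a < X + Y \le b) = b - a.$$

The essential support of the common distribution of $X$ and $Y$ therefore is $[0, 1/2]$ (for otherwise there would be positive probability that $X + Y$ lies outside $[0,1]$).

The Picture

Let $0 < \epsilon < 1/4$. Contemplate this diagram showing how sums of random variables are computed: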

Figure

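The underlying probability distribution is the joint one for $(X, Y)$. The probability of any event $a < X + Y \le b$ is given by the total probability covered by the diagonal band stretching between the lines $x + y = a$ and $x + y = b$. Three such bands are shown: from $0$ to $\epsilon$, appearing as a small blue triangle at the lower left; from $1/2 - \epsilon$ to $1/2 + \epsilon$, shown as a gray rectangle capped with two (yellow and green) triangles; and from $1 - \epsilon$ to $1$, appearing as a small red triangle at the upper right.

What the Picture Shows

By comparing the lower left triangle in the figure to the lower left square containing it, and exploiting the iid assumption for $X$ and $Y$, it is evident that

$$\epsilon = \Pr(X + Y \le \epsilon) < \Pr(X \le \epsilon)\Pr(Y \le \epsilon) = \Pr(X \le \epsilon)^2.$$

Note that the inequality is strict: equality is not possible because there is some positive probability that either of $X$ or $Y$ is less than $\epsilon$ but nevertheless $X + Y > \epsilon$.

Similarly, comparing the red triangle to the square in the upper right corner,

$$\epsilon = \Pr(X + Y > 1 - \epsilon) < \Pr(X > 1/2 - \epsilon)^2.$$

Finally, comparing the two opposite triangles in the upper left and lower right to the diagonal band containing them gives another strict inequality,

$$2\epsilon < 2\Pr(X \le \epsilon)\Pr(X > 1/2 - \epsilon) < \Pr(1/2 - \epsilon < X + Y \le 1/2 + \epsilon) = 2\epsilon.$$

The first inequality arises from the previous two (take their square roots and multiply them), the second describes the (strict) inclusion of the triangles within the band, and the last equality expresses the uniformity of $X + Y$. The conclusion that $2\epsilon < 2\epsilon$ is the contradiction proving that such $X$ and $Y$ cannot exist, QED.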
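The chain of inequalities can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the original answer: it takes a hypothetical discrete $X$ on a grid in $[0, 1/2]$ (the function name `band_probabilities` and the grid are made up for the example), computes the three band probabilities exactly, and confirms the non-strict bound $2\sqrt{\Pr(\text{low})\Pr(\text{high})} \le \Pr(\text{mid})$ that the geometry gives for any iid pair supported on $[0, 1/2]$.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

HALF = Fraction(1, 2)


def band_probabilities(pmf, eps):
    """Exact probabilities of the three diagonal bands for S = X + Y.

    pmf maps each value of X (a Fraction in [0, 1/2]) to its probability;
    Y is an independent copy of X.
    """
    low = mid = high = Fraction(0)
    for (x, px), (y, py) in product(pmf.items(), repeat=2):
        s, p = x + y, px * py
        if s <= eps:                        # blue triangle
            low += p
        if HALF - eps < s <= HALF + eps:    # central band
            mid += p
        if s > 1 - eps:                     # red triangle
            high += p
    return low, mid, high


# Hypothetical example: X uniform on the grid {0, 1/8, 2/8, 3/8, 4/8}.
pmf_x = {Fraction(k, 8): Fraction(1, 5) for k in range(5)}
eps = Fraction(1, 8)
low, mid, high = band_probabilities(pmf_x, eps)

# Geometric bound: twice the geometric mean of the corner bands
# cannot exceed the probability of the central band.
lower_bound = 2 * sqrt(low * high)
print(low, mid, high, lower_bound)
```

A uniform sum would need `low == high == eps` and `mid == 2 * eps`, which makes the bound an equality; the strictness of the inequalities in the answer is what excludes that case.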


3
(+1) I like this approach. Recovering my back-of-an-envelope from the wastepaper basket I can see I drew the same diagram, except that I didn't mark on the yellow and green triangles inside the band. I did obtain the inequalities for the blue and red triangles. I played around with them and a few other probabilities, but never thought to investigate the probability of the strip, which turns out to be the critical step. I wonder what thought process might have motivated this insight?
Silverfish

In fact, where @whuber has yellow and green triangles, I did draw on squares (I'd effectively decomposed $[0, 0.5]^2$ into a grid). Looking at the step which "describes the (strict) inclusion of the triangles within the band", $2\Pr(X \le \epsilon)\Pr(X > 1/2 - \epsilon) < \Pr(1/2 - \epsilon < X + Y \le 1/2 + \epsilon)$, I wonder whether this would actually be geometrically more natural with squares capping the band than triangles?
Silverfish

1
@Silver I was reminded of an analysis of sums of uniform distributions I posted a couple of years ago. That suggested visualizing the sum $X + Y$ geometrically. It was immediately evident that a lot of probability had to be concentrated near the corners $(0,0)$ and $(1/2,1/2)$ in order for the sum to be uniform and for relatively little probability to be near the center diagonal $X + Y = 1/2$. That led to the diagram, which I redrew in Mathematica. At that point the answer wrote itself. Yes, using squares in the center band might be neater.
whuber

Thanks! "Note that the inequality is strict: equality is not possible because there is some positive probability that either of $X$ or $Y$ is less than $\epsilon$ but nevertheless $X + Y > \epsilon$." I'm not sure I follow this. It seems to me the aim here is to show $\Pr(X + Y \le \epsilon) < \Pr(X \le \epsilon \wedge Y \le \epsilon)$, doesn't this require a positive probability for some event $A$ in which both of $X$ and $Y$ are less than or equal to $\epsilon$ and yet $X + Y > \epsilon$? It is the "either of" vs "both of" I'm vacillating over.
Silverfish

@Silverfish Thank you; I did not express that as I had intended. You are correct: the language is intended essentially to describe the portion of a little square not inside the triangle.
whuber

10

I tried finding a proof without considering characteristic functions. Excess kurtosis does the trick. Here's the two-line answer: $\operatorname{Kurt}(U) = \operatorname{Kurt}(X + Y) = \operatorname{Kurt}(X)/2$ since $X$ and $Y$ are iid. Then $\operatorname{Kurt}(U) = -1.2$ implies $\operatorname{Kurt}(X) = -2.4$, which is a contradiction as $\operatorname{Kurt}(X) \ge -2$ for any random variable.

Rather more interesting is the line of reasoning that got me to that point. $X$ (and $Y$) must be bounded between 0 and 0.5 - that much is obvious, but helpfully means that its moments and central moments exist. Let's start by considering the mean and variance: $E(U) = 0.5$ and $\operatorname{Var}(U) = \frac{1}{12}$. If $X$ and $Y$ are identically distributed then we have:

$$E(X + Y) = E(X) + E(Y) = 2E(X) = 0.5$$

So $E(X) = 0.25$. For the variance we additionally need to use independence to apply:

$$\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) = 2\operatorname{Var}(X) = \frac{1}{12}$$

Hence $\operatorname{Var}(X) = \frac{1}{24}$ and $\sigma_X = \frac{1}{2\sqrt{6}} \approx 0.204$. Wow! That is a lot of variation for a random variable whose support ranges from 0 to 0.5. But we should have expected that, since the standard deviation isn't going to scale in the same way that the mean did.

Now, what's the largest standard deviation that a random variable can have if the smallest value it can take is 0, the largest value it can take is 0.5, and the mean is 0.25? Collecting all the probability at two point masses on the extremes, 0.25 away from the mean, would clearly give a standard deviation of 0.25. So our $\sigma_X$ is large but not impossible. (I hoped to show that this implied too much probability lay in the tails for $X + Y$ to be uniform, but I couldn't get anywhere with that on the back of an envelope.)
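The comparison between the required and the maximal standard deviation can be checked with exact arithmetic. This is a small sketch; the helper `mean_and_variance` and the variable names are illustrative assumptions rather than anything from the original answer.

```python
from fractions import Fraction
from math import sqrt


def mean_and_variance(pmf):
    """Exact mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var


# Extreme case: all probability on the endpoints 0 and 1/2, half each.
two_point = {Fraction(0): Fraction(1, 2), Fraction(1, 2): Fraction(1, 2)}
mean, max_var = mean_and_variance(two_point)

# Variance that X would need if X + Y were uniform on [0, 1].
required_var = Fraction(1, 24)

print(sqrt(max_var), sqrt(required_var))
```

The required variance stays below the two-point maximum, which is why second moments alone do not produce a contradiction.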

Second moment considerations almost put an impossible constraint on $X$, so let's consider higher moments. What about Pearson's moment coefficient of skewness, $\gamma_1 = \frac{E(X - \mu_X)^3}{\sigma_X^3} = \frac{\kappa_3}{\kappa_2^{3/2}}$? This exists since the central moments exist and $\sigma_X \neq 0$. It is helpful to know some properties of the cumulants, in particular applying independence and then identical distribution gives:

$$\kappa_i(U) = \kappa_i(X + Y) = \kappa_i(X) + \kappa_i(Y) = 2\kappa_i(X)$$

This additivity property is precisely the generalisation of how we dealt with the mean and variance above - indeed, the first and second cumulants are just $\kappa_1 = \mu$ and $\kappa_2 = \sigma^2$.

Then $\kappa_3(U) = 2\kappa_3(X)$ and $(\kappa_2(U))^{3/2} = (2\kappa_2(X))^{3/2} = 2^{3/2}(\kappa_2(X))^{3/2}$. The fraction for $\gamma_1$ cancels to yield $\operatorname{Skew}(U) = \operatorname{Skew}(X + Y) = \operatorname{Skew}(X)/\sqrt{2}$. Since the uniform distribution has zero skewness, so does $X$, but I can't see how a contradiction arises from this restriction.

So instead, let's try the excess kurtosis, $\gamma_2 = \frac{\kappa_4}{\kappa_2^2} = \frac{E(X - \mu_X)^4}{\sigma_X^4} - 3$. By a similar argument (this question is self-study, so try it!), we can show this exists and obeys:

$$\operatorname{Kurt}(U) = \operatorname{Kurt}(X + Y) = \operatorname{Kurt}(X)/2$$
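For readers who want the step spelled out, one way to obtain it (a sketch that uses only the additivity of cumulants for independent summands and the definition of excess kurtosis as $\kappa_4/\kappa_2^2$) is:

$$\operatorname{Kurt}(X + Y) = \frac{\kappa_4(X + Y)}{\kappa_2(X + Y)^2} = \frac{2\kappa_4(X)}{\left(2\kappa_2(X)\right)^2} = \frac{1}{2}\cdot\frac{\kappa_4(X)}{\kappa_2(X)^2} = \frac{\operatorname{Kurt}(X)}{2}.$$

Existence follows because $X$ is bounded, so all its moments are finite, and $\kappa_2(X) = \sigma_X^2 \neq 0$.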

The uniform distribution has excess kurtosis $-1.2$, so we require $X$ to have excess kurtosis $-2.4$. But the smallest possible excess kurtosis is $-2$, which is achieved by the $\text{Binomial}(1, \frac{1}{2})$ Bernoulli distribution.
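The two kurtosis values quoted here can be reproduced exactly from raw moments. The function below is a minimal sketch (its name and signature are chosen for this example); it uses $E(U^k) = 1/(k+1)$ for the uniform distribution on $[0,1]$ and $E(B^k) = 1/2$ for a Bernoulli variable with success probability $1/2$.

```python
from fractions import Fraction


def excess_kurtosis(m1, m2, m3, m4):
    """Excess kurtosis computed from the first four raw moments E[X^k]."""
    var = m2 - m1 ** 2
    # Fourth central moment expanded in terms of raw moments.
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu4 / var ** 2 - 3


# Uniform on [0, 1]: E[U^k] = 1 / (k + 1).
kurt_uniform = excess_kurtosis(*(Fraction(1, k + 1) for k in range(1, 5)))

# Bernoulli(1/2): every raw moment equals 1/2.
kurt_bernoulli = excess_kurtosis(*([Fraction(1, 2)] * 4))

# Kurt(U) = Kurt(X) / 2 would force this value on X.
required_kurt_x = 2 * kurt_uniform

print(kurt_uniform, kurt_bernoulli, required_kurt_x)
```

The required value for $X$ falls below the Bernoulli value of $-2$, which is the contradiction the answer relies on.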


2
(+1) This is a quite clever approach, which was new to me. Thanks. Note that some of your analysis could have been streamlined by considering a uniform centered at zero. (The equivalence of the problem is immediate.) That would have immediately told you that considering skew was a dead-end.
cardinal

@cardinal: I knew the skew was a dead-end before I worked on it. The purpose was expository: it's a self-study question so I didn't want to solve it in full! Rather I wanted to leave a hint on how to deal with the next level up...
Silverfish

@cardinal: I was in two minds whether to center or not. I did back-of-envelope calculations more conveniently, but in the final analysis we just need (1) a simple case of the general result that $\operatorname{Kurt}(X_1 + \dots + X_n) = \frac{1}{n}\operatorname{Kurt}(X)$ for iid $X_i$, (2) that $\operatorname{Kurt}(U) = -1.2$ for any uniform distribution, and (3) $\operatorname{Kurt}(X)$ exists since $X$ is bounded and $\sigma_X \neq 0$ (which is trivial, else $\sigma_U = 0$). So none of the key results actually required centering, though bits may have looked less ugly!
Silverfish

Yes, the word "streamlined" was carefully chosen. :-) I did not intend my comment to be read as criticism of your exposition. Cheers.
cardinal

@cardinal Incidentally, variance considerations alone almost worked, but the uniform isn't quite spread out enough. With a bit more probability mass nearer the extremes, e.g. $f_T(t) = 12t^2$ on $[-0.5, 0.5]$, then $\operatorname{Var}(T) = 0.15$ and if $T = X_1 + X_2$ then $\sigma_X = \sqrt{0.15/2} \approx 0.27 > 0.25$, which is impossible as $X$ is bounded by $-0.25$ and $0.25$. Of course, you will see immediately how this relates to the present example! I wonder if the approach generalises; I'm sure other bounded RVs can't be decomposed into sums but require even higher moments investigated to find the contradiction.
Silverfish
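The arithmetic in this last comment can be verified in a few lines. This is a sketch only: the variable names are illustrative, and the integrals of $12t^2$ and $12t^4$ over $[-1/2, 1/2]$ are evaluated in closed form rather than numerically.

```python
from fractions import Fraction
from math import sqrt

half = Fraction(1, 2)

# Total mass: integral of 12 t^2 over [-1/2, 1/2] = 2 * 12 * (1/2)^3 / 3.
total_mass = 2 * 12 * half ** 3 / 3

# Var(T) = integral of t^2 * 12 t^2 over [-1/2, 1/2] = 2 * 12 * (1/2)^5 / 5
# (the mean is 0 by symmetry).
var_t = 2 * 12 * half ** 5 / 5

# If T = X1 + X2 with X1, X2 iid, each summand needs half the variance.
sd_x = sqrt(var_t / 2)

# Largest standard deviation possible for a variable confined to [-1/4, 1/4].
max_sd = 0.25

print(total_mass, var_t, sd_x)
```

Here the required standard deviation exceeds the maximum attainable on an interval of length 0.5, so second moments alone settle the question for this density.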
Licensed under cc by-sa 3.0 with attribution required.