Many PDFs range from minus infinity to plus infinity, yet the mean is defined for some of them and not for others. What common trait makes the mean computable for some distributions but not others?
Answers:
The mean of a distribution is defined using an integral (I'll write it as if for a continuous distribution, say as a Riemann integral, but the issue is more general; we could move to Stieltjes or Lebesgue integration to deal with everything correctly and all at once):

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
But what does that mean? It is effectively shorthand for

$\lim_{a\to-\infty,\ b\to\infty} \int_a^b x f(x)\,dx$

or

$\lim_{a\to-\infty} \int_a^0 x f(x)\,dx \,+\, \lim_{b\to\infty} \int_0^b x f(x)\,dx$

(though you could split it at any point, not just at $0$).

The problem comes when the limits of those integrals are not finite.
So, for example, consider the standard Cauchy density, which is proportional to $\frac{1}{1+x^2}$. Note that the upper half is

$\lim_{b\to\infty} \int_0^b \frac{x}{1+x^2}\,dx\,.$

Let $u = 1+x^2$, so $du = 2x\,dx$; then

$\int_0^b \frac{x}{1+x^2}\,dx = \tfrac{1}{2}\ln(1+b^2) \to \infty \quad \text{as } b\to\infty\,.$

This is not finite. The limit for the lower half is not finite either; hence the expectation is undefined.
Or, if instead we had the absolute value of a standard Cauchy as our random variable, its entire expectation would be proportional to the limit we were just looking at (i.e. $\infty$).
On the other hand, some other densities also continue "out to infinity", but their integrals do have a finite limit.
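A quick numerical sketch of that contrast (in Python with scipy, not part of the original answer): the upper-half integral $\int_0^b x f(x)\,dx$ keeps growing with $b$ for the Cauchy density but levels off for the normal.

```python
import numpy as np
from scipy import integrate, stats

# Upper-half "moment" integral from 0 to b for each density.
# For the standard Cauchy it equals ln(1 + b^2) / (2*pi), which grows
# without bound; for the standard normal it converges to 1/sqrt(2*pi).
for b in [10.0, 100.0, 1000.0]:
    cauchy_part, _ = integrate.quad(lambda x: x * stats.cauchy.pdf(x), 0, b)
    normal_part, _ = integrate.quad(lambda x: x * stats.norm.pdf(x), 0, b)
    print(b, cauchy_part, normal_part)

# Closed form for the Cauchy piece, via the substitution u = 1 + x^2:
closed_form = np.log(1.0 + 1000.0**2) / (2.0 * np.pi)
```

By $b = 1000$ the Cauchy piece has passed $2$ and is still climbing like $\ln(b)/\pi$, while the normal piece sits at $1/\sqrt{2\pi} \approx 0.399$.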
The other answers are good, but might not convince everyone, especially people who take one look at the Cauchy distribution (with location $x_0 = 0$) and say it's still intuitively obvious that the mean should be zero.
The reason the intuitive answer is not correct from the mathematical perspective is the Riemann rearrangement theorem (video).
Effectively, what you're doing when you look at a Cauchy and say that the mean "should be zero" is that you're splitting it down the "center" at zero, and then claiming the moments of the two sides balance. In other words, you're implicitly doing an infinite sum with "half" the terms positive (the moments at each point to the right) and "half" the terms negative (the moments at each point to the left) and claiming it sums to zero. (For the technically minded: you're taking the Cauchy principal value, $\lim_{a\to\infty}\int_{-a}^{a} x f(x)\,dx = 0$, rather than the unrestricted two-sided limit.)
The Riemann rearrangement theorem says that this type of infinite sum (one with both positive and negative terms) is only consistent if the two series (positive terms only and negative terms only) are each convergent when taken independently. If both sides (positive and negative) are divergent on their own, then you can come up with an order of summation of the terms such that it sums to any number. (Video above, starting at 6:50)
So, yes, if you do the summation in a balanced manner from 0 out, the first moments from the Cauchy distribution cancel out. However, the (standard) definition of mean doesn't enforce this order of summation. You should be able to sum the moments in any order and have it be equally valid. Therefore, the mean of the Cauchy distribution is undefined - by judiciously choosing how you sum the moments, you can make them "balance" (or not) at practically any point.
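This order-dependence is easy to see numerically (a Python sketch with scipy, using the standard Cauchy density $f(x) = \frac{1}{\pi(1+x^2)}$): truncating symmetrically at $\pm a$ always gives 0, but truncating at $-a$ and $2a$ (both ends still march off to infinity) converges to $\ln 2/\pi$ instead.

```python
import numpy as np
from scipy import integrate, stats

f = stats.cauchy.pdf  # standard Cauchy density 1/(pi*(1+x^2))

for a in [10.0, 100.0, 1000.0]:
    # Symmetric truncation: the principal-value style of summation.
    symmetric, _ = integrate.quad(lambda x: x * f(x), -a, a)
    # Lopsided truncation: same "limits to infinity", different order.
    lopsided, _ = integrate.quad(lambda x: x * f(x), -a, 2.0 * a)
    print(a, symmetric, lopsided)

# The lopsided version tends to ln(2)/pi ~ 0.2206, not 0.
target = np.log(2.0) / np.pi
```

Both truncation schemes exhaust the whole real line in the limit, yet they "sum the moments" in different orders and land on different values, which is exactly why the unrestricted definition calls the mean undefined.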
So to make the mean of a distribution defined, the two moment integrals need to each be independently convergent (finite) around the proposed mean (which, when you do the math, is really just another way of saying that the full integral $\int_{-\infty}^{\infty} |x| f(x)\,dx$ needs to be convergent). If the tails are "fat" enough to make the moment for one side infinite, you're done. You can't balance it out with an infinite moment on the other side.
I should mention that the "counter intuitive" behavior of things like the Cauchy distribution is entirely due to problems when thinking about infinity. Take the Cauchy distribution and chop off the tails - even arbitrarily far out, like at plus/minus the xkcd number - and (once re-normalized) you suddenly get something that's well behaved and has a defined mean. It's not the fat tails in-and-of-themselves that are an issue, it's how those tails behave as you approach infinity.
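A sketch of that last point (Python with scipy; the truncation point $A$ is an arbitrary choice of mine): chop the tails at $\pm A$, renormalize, and the absolute moment becomes finite, so the mean is well defined (and equal to 0 by symmetry, in every order of summation).

```python
import numpy as np
from scipy import integrate, stats

A = 1e4  # arbitrary truncation point; any finite A gives a well-behaved density
f = stats.cauchy.pdf

# Mass remaining after the tails beyond +/-A are chopped off (for renormalizing).
mass, _ = integrate.quad(f, -A, A)

# E|X| for the truncated, renormalized density: finite, so the mean exists...
abs_moment, _ = integrate.quad(lambda x: abs(x) * f(x) / mass, -A, A)

# ...and by symmetry the mean itself is 0, however you order the summation.
mean, _ = integrate.quad(lambda x: x * f(x) / mass, -A, A)
print(mass, abs_moment, mean)
```

Note how little mass the chopped tails carried (well under 0.1% here) even though they were what destroyed the mean: it really is the behavior toward infinity, not the bulk of the density, that causes the trouble.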
General Abrial and Glen_b gave excellent answers. I just want to add a small demo to show that the mean of the Cauchy distribution does not exist / does not converge.
In the following experiment, you will see that even if you take a large sample and calculate the empirical mean from it, the numbers are quite different from experiment to experiment.
set.seed(0)
par(mfrow=c(1,2))

# 100 experiments, each drawing 1e5 samples from each distribution
experiments=rep(1e5,100)
mean_list_cauchy=sapply(experiments, function(n) mean(rcauchy(n)))
mean_list_normal=sapply(experiments, function(n) mean(rnorm(n)))

# Empirical mean per experiment: Cauchy scatters wildly, normal stays near 0
plot(mean_list_cauchy,ylim=c(-10,10))
plot(mean_list_normal,ylim=c(-10,10))
You can observe that we have 100 experiments, and in each experiment we sample $10^5$ points from each of the two distributions. With such a big sample size, the empirical mean across different experiments should be fairly close to the true mean. The results show that the Cauchy distribution does not have a converging mean, but the normal distribution does.
EDIT:
As @mark999 mentioned in the chat, we should argue that the two distributions used in the experiment have similar "variance" (the quotes are because the variance of the Cauchy distribution is also undefined). Here is the justification: their PDFs are similar.
Note that, by looking at the PDF of the Cauchy distribution, we would guess its mean is $0$, but from the experiments we can see it does not exist. That is the point of the demo.
curve(dnorm, -8,8)
curve(dcauchy, -8,8)
By the definition of the Lebesgue–Stieltjes integral, the mean exists if $E|X| = \int_{-\infty}^{\infty} |x|\,dF(x) < \infty$:
https://en.wikipedia.org/wiki/Moment_(mathematics)#Significance_of_the_moments
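That criterion can be checked numerically (a Python sketch with scipy, not part of the original answer): the absolute-moment integral $\int_{-b}^{b} |x| f(x)\,dx$ stabilizes as $b$ grows for the normal but keeps growing for the Cauchy, so $E|X|$ is finite for one and infinite for the other.

```python
import numpy as np
from scipy import integrate, stats

# E|X| = integral of |x| f(x) dx must be finite for the mean to exist.
# Both densities are symmetric, so integrate x*f(x) over [0, b] and double.
for b in [10.0, 100.0, 1000.0]:
    normal_absmom = 2.0 * integrate.quad(lambda x: x * stats.norm.pdf(x), 0, b)[0]
    cauchy_absmom = 2.0 * integrate.quad(lambda x: x * stats.cauchy.pdf(x), 0, b)[0]
    print(b, normal_absmom, cauchy_absmom)

# Normal: converges to sqrt(2/pi) ~ 0.798.  Cauchy: grows like 2*ln(b)/pi.
```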
The Cauchy distribution is a disguised form of a very fundamental distribution, namely the uniform distribution on a circle. In formulas, the infinitesimal probability is $d\theta/(2\pi)$, where $\theta$ is the angle coordinate. The probability (or measure) of an arc is its length divided by $2\pi$. This is different from the uniform distribution on the interval $(-\pi,\pi]$, though the measures are indeed the same for arcs not containing the cut point $\theta=\pi$. For example, on the arc from $\theta=\pi/2$ counter-clockwise to $\theta=3\pi/2$, the mean of the distribution on the circle is at $\theta=\pi$. But the mean of the uniform distribution on the corresponding union of two disjoint intervals $[\pi/2,\pi]$ and $(-\pi,-\pi/2]$, each of length $\pi/2$, is zero.
Since the distribution on the circle is rotationally symmetric, there cannot be a mean, median or mode on the circle. Similarly, higher moments, such as variance, cannot make sense. This distribution arises naturally in many contexts. For example, my current project involves microscope images of cancerous tissue. The very numerous objects in the image are not symmetric and a "direction" can be assigned to each. The obvious null hypothesis is that these directions are uniformly distributed.
To disguise the simplicity, let $S^1 \subset \mathbb{R}^2$ be the standard unit circle, and let $N = (0,1)$. We define $x$ as a function of $\theta$ by stereographic projection of the circle from $N$ onto the $x$-axis. The formula is $x = \cot(\theta/2)$. Differentiating, we find $dx = -\tfrac12 \csc^2(\theta/2)\,d\theta = -\tfrac12 (1+x^2)\,d\theta$. The infinitesimal probability is therefore $\frac{|d\theta|}{2\pi} = \frac{dx}{\pi(1+x^2)}$, the usual form of the Cauchy distribution, and "Hey, presto!", simplicity becomes a headache, requiring treatment by the subtleties of integration theory.
In $\mathbb{R}$, we can ignore the absence of the point at infinity (in other words, reinstate $\theta=0$) for any consideration such as the mean or higher-order moments, because the probability of $\theta=0$ (its measure) is zero. So the non-existence of the mean and of higher moments carries over to the real line. However, there is now a special point, namely $\theta=\pi$, which maps to $x=0$ under stereographic projection, and this becomes the median and mode of the Cauchy distribution.
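A quick sanity check of this construction (a Python sketch; the explicit projection formula $x = \cot(\theta/2)$, projecting from $(0,1)$, is my assumption about the intended map): push a uniform angle through the projection and you get standard Cauchy samples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Uniform angle on the circle, measured from the projection point (0, 1).
theta = rng.uniform(0.0, 2.0 * np.pi, size=100_000)

# Stereographic projection onto the x-axis (assumed formula: x = cot(theta/2)).
x = 1.0 / np.tan(theta / 2.0)

# Quartiles of the standard Cauchy are -1, 0, +1; the sample should match.
print(np.quantile(x, [0.25, 0.5, 0.75]))

# Kolmogorov-Smirnov distance to the standard Cauchy CDF should be tiny.
print(stats.kstest(x, stats.cauchy.cdf).statistic)
```

The quartiles land near $-1$, $0$, $+1$ and the KS distance is small, consistent with the claim that a uniform angle on the circle, viewed through this projection, is exactly a standard Cauchy variable.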