Many PDFs range from minus infinity to plus infinity, yet the mean is defined for some of them and not for others. What common trait makes the mean computable for some distributions but not others?
Answers:
The mean of a distribution is defined using an integral (I'll write it as if for a continuous distribution, say as a Riemann integral, but the issue is more general; we could move to Stieltjes or Lebesgue integration to deal with everything correctly and all at once):

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
But what does that mean? It is effectively shorthand for

$\lim_{a\to-\infty,\ b\to\infty} \int_a^b x f(x)\,dx$

or

$\lim_{a\to-\infty} \int_a^0 x f(x)\,dx \,+\, \lim_{b\to\infty} \int_0^b x f(x)\,dx$

(though you could split it at any point, not just at $0$).

The problem comes when the limits of those integrals are not finite.
So, for example, consider the standard Cauchy density, which is proportional to $\frac{1}{1+x^2}$. Note that the upper half is

$\lim_{b\to\infty} \int_0^b \frac{x}{1+x^2}\,dx\,.$

Let $u = 1+x^2$, so $du = 2x\,dx$; then

$\int_0^b \frac{x}{1+x^2}\,dx = \tfrac{1}{2}\ln(1+b^2) \to \infty \quad \text{as } b\to\infty\,.$

This is not finite. The limit for the lower half is not finite either; hence the expectation is undefined.
Or, if instead we had the absolute value of a standard Cauchy as our random variable, its entire expectation would be proportional to the limit we were just looking at (i.e. $\infty$).
On the other hand, some other densities also continue "out to infinity", but their integrals do have a finite limit.
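A quick numerical sketch of that contrast (in Python with scipy, not part of the original answer): the upper-half integral $\int_0^b x f(x)\,dx$ keeps growing with $b$ for the Cauchy density but levels off for the normal.

```python
import numpy as np
from scipy import integrate, stats

# Upper-half "moment" integral from 0 to b for each density.
# For the standard Cauchy it equals ln(1 + b^2) / (2*pi), which grows
# without bound; for the standard normal it converges to 1/sqrt(2*pi).
for b in [10.0, 100.0, 1000.0]:
    cauchy_part, _ = integrate.quad(lambda x: x * stats.cauchy.pdf(x), 0, b)
    normal_part, _ = integrate.quad(lambda x: x * stats.norm.pdf(x), 0, b)
    print(b, cauchy_part, normal_part)

# Closed form for the Cauchy piece, via the substitution u = 1 + x^2:
closed_form = np.log(1.0 + 1000.0**2) / (2.0 * np.pi)
```

By $b = 1000$ the Cauchy piece has passed $2$ and is still climbing like $\ln(b)/\pi$, while the normal piece sits at $1/\sqrt{2\pi} \approx 0.399$.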
The other answers are good, but might not convince everyone, especially people who take one look at the Cauchy distribution (with location $x_0 = 0$) and say it's still intuitively obvious that the mean should be zero.
The reason the intuitive answer is not correct from the mathematical perspective is the Riemann rearrangement theorem (video).
Effectively, what you're doing when you look at a Cauchy and say that the mean "should be zero" is that you're splitting it down the "center" at zero, and then claiming the moments of the two sides balance. In other words, you're implicitly doing an infinite sum with "half" the terms positive (the moments at each point to the right) and "half" the terms negative (the moments at each point to the left) and claiming it sums to zero. (For the technically minded: you're taking the Cauchy principal value, $\lim_{a\to\infty}\int_{-a}^{a} x f(x)\,dx = 0$, rather than the unrestricted two-sided limit.)
The Riemann rearrangement theorem says that this type of infinite sum (one with both positive and negative terms) is only consistent if the two series (positive terms only and negative terms only) are each convergent when taken independently. If both sides (positive and negative) are divergent on their own, then you can come up with an order of summation of the terms such that it sums to any number. (Video above, starting at 6:50)
So, yes, if you do the summation in a balanced manner from 0 out, the first moments from the Cauchy distribution cancel out. However, the (standard) definition of mean doesn't enforce this order of summation. You should be able to sum the moments in any order and have it be equally valid. Therefore, the mean of the Cauchy distribution is undefined - by judiciously choosing how you sum the moments, you can make them "balance" (or not) at practically any point.
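This order-dependence is easy to see numerically (a Python sketch with scipy, using the standard Cauchy density $f(x) = \frac{1}{\pi(1+x^2)}$): truncating symmetrically at $\pm a$ always gives 0, but truncating at $-a$ and $2a$ (both ends still march off to infinity) converges to $\ln 2/\pi$ instead.

```python
import numpy as np
from scipy import integrate, stats

f = stats.cauchy.pdf  # standard Cauchy density 1/(pi*(1+x^2))

for a in [10.0, 100.0, 1000.0]:
    # Symmetric truncation: the principal-value style of summation.
    symmetric, _ = integrate.quad(lambda x: x * f(x), -a, a)
    # Lopsided truncation: same "limits to infinity", different order.
    lopsided, _ = integrate.quad(lambda x: x * f(x), -a, 2.0 * a)
    print(a, symmetric, lopsided)

# The lopsided version tends to ln(2)/pi ~ 0.2206, not 0.
target = np.log(2.0) / np.pi
```

Both truncation schemes exhaust the whole real line in the limit, yet they "sum the moments" in different orders and land on different values, which is exactly why the unrestricted definition calls the mean undefined.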
So to make the mean of a distribution defined, the two moment integrals need to each be independently convergent (finite) around the proposed mean (which, when you do the math, is really just another way of saying that the full integral $\int_{-\infty}^{\infty} |x| f(x)\,dx$ needs to be convergent). If the tails are "fat" enough to make the moment for one side infinite, you're done. You can't balance it out with an infinite moment on the other side.
I should mention that the "counter intuitive" behavior of things like the Cauchy distribution is entirely due to problems when thinking about infinity. Take the Cauchy distribution and chop off the tails - even arbitrarily far out, like at plus/minus the xkcd number - and (once re-normalized) you suddenly get something that's well behaved and has a defined mean. It's not the fat tails in-and-of-themselves that are an issue, it's how those tails behave as you approach infinity.
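A sketch of that last point (Python with scipy; the truncation point $A$ is an arbitrary choice of mine): chop the tails at $\pm A$, renormalize, and the absolute moment becomes finite, so the mean is well defined (and equal to 0 by symmetry, in every order of summation).

```python
import numpy as np
from scipy import integrate, stats

A = 1e4  # arbitrary truncation point; any finite A gives a well-behaved density
f = stats.cauchy.pdf

# Mass remaining after the tails beyond +/-A are chopped off (for renormalizing).
mass, _ = integrate.quad(f, -A, A)

# E|X| for the truncated, renormalized density: finite, so the mean exists...
abs_moment, _ = integrate.quad(lambda x: abs(x) * f(x) / mass, -A, A)

# ...and by symmetry the mean itself is 0, however you order the summation.
mean, _ = integrate.quad(lambda x: x * f(x) / mass, -A, A)
print(mass, abs_moment, mean)
```

Note how little mass the chopped tails carried (well under 0.1% here) even though they were what destroyed the mean: it really is the behavior toward infinity, not the bulk of the density, that causes the trouble.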
General Abrial and Glen_b gave excellent answers. I just want to add a small demo to show that the mean of the Cauchy distribution does not exist / does not converge.
In the following experiment, you will see that even if you take a large sample and calculate the empirical mean from it, the numbers are quite different from experiment to experiment.
set.seed(0)
par(mfrow=c(1,2))

# 100 experiments, each drawing 1e5 samples from each distribution
experiments=rep(1e5,100)
mean_list_cauchy=sapply(experiments, function(n) mean(rcauchy(n)))
mean_list_normal=sapply(experiments, function(n) mean(rnorm(n)))

# Empirical mean per experiment: Cauchy scatters wildly, normal stays near 0
plot(mean_list_cauchy,ylim=c(-10,10))
plot(mean_list_normal,ylim=c(-10,10))
You can observe that we have 100 experiments, and in each experiment we sample $10^5$ points from each of the two distributions. With such a big sample size, the empirical mean across different experiments should be fairly close to the true mean. The results show that the Cauchy distribution does not have a converging mean, but the normal distribution does.
EDIT:
As @mark999 mentioned in the chat, we should argue that the two distributions used in the experiment have similar "variance" (the quotes are because the variance of the Cauchy distribution is also undefined). Here is the justification: their PDFs are similar.
Note that, by looking at the PDF of the Cauchy distribution, we would guess its mean is $0$, but from the experiments we can see it does not exist. That is the point of the demo.
curve(dnorm, -8,8)
curve(dcauchy, -8,8)
By the definition of the Lebesgue–Stieltjes integral, the mean exists if $E|X| = \int_{-\infty}^{\infty} |x|\,dF(x) < \infty$:
https://en.wikipedia.org/wiki/Moment_(mathematics)#Significance_of_the_moments
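That criterion can be checked numerically (a Python sketch with scipy, not part of the original answer): the absolute-moment integral $\int_{-b}^{b} |x| f(x)\,dx$ stabilizes as $b$ grows for the normal but keeps growing for the Cauchy, so $E|X|$ is finite for one and infinite for the other.

```python
import numpy as np
from scipy import integrate, stats

# E|X| = integral of |x| f(x) dx must be finite for the mean to exist.
# Both densities are symmetric, so integrate x*f(x) over [0, b] and double.
for b in [10.0, 100.0, 1000.0]:
    normal_absmom = 2.0 * integrate.quad(lambda x: x * stats.norm.pdf(x), 0, b)[0]
    cauchy_absmom = 2.0 * integrate.quad(lambda x: x * stats.cauchy.pdf(x), 0, b)[0]
    print(b, normal_absmom, cauchy_absmom)

# Normal: converges to sqrt(2/pi) ~ 0.798.  Cauchy: grows like 2*ln(b)/pi.
```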
The Cauchy distribution is a disguised form of a very fundamental distribution, namely the uniform distribution on a circle. In formulas, the infinitesimal probability is $d\theta/(2\pi)$, where $\theta$ is the angle coordinate. The probability (or measure) of an arc is its length divided by $2\pi$. This is different from the uniform distribution on the interval $(-\pi,\pi]$, though the measures are indeed the same for arcs not containing the cut point $\theta=\pi$. For example, on the arc from $\theta=\pi/2$ counter-clockwise to $\theta=3\pi/2$, the mean of the distribution on the circle is at $\theta=\pi$. But the mean of the uniform distribution on the corresponding union of two disjoint intervals $[\pi/2,\pi]$ and $(-\pi,-\pi/2]$, each of length $\pi/2$, is zero.
Since the distribution on the circle is rotationally symmetric, there cannot be a mean, median or mode on the circle. Similarly, higher moments, such as variance, cannot make sense. This distribution arises naturally in many contexts. For example, my current project involves microscope images of cancerous tissue. The very numerous objects in the image are not symmetric and a "direction" can be assigned to each. The obvious null hypothesis is that these directions are uniformly distributed.
To disguise the simplicity, let $S^1 \subset \mathbb{R}^2$ be the standard unit circle, and let $N = (0,1)$. We define $x$ as a function of $\theta$ by stereographic projection of the circle from $N$ onto the $x$-axis. The formula is $x = \cot(\theta/2)$. Differentiating, we find $dx = -\tfrac12 \csc^2(\theta/2)\,d\theta = -\tfrac12 (1+x^2)\,d\theta$. The infinitesimal probability is therefore $\frac{|d\theta|}{2\pi} = \frac{dx}{\pi(1+x^2)}$, the usual form of the Cauchy distribution, and "Hey, presto!", simplicity becomes a headache, requiring treatment by the subtleties of integration theory.
In $\mathbb{R}$, we can ignore the absence of the point at infinity (in other words, reinstate $\theta=0$) for any consideration such as the mean or higher-order moments, because the probability of $\theta=0$ (its measure) is zero. So the non-existence of the mean and of higher moments carries over to the real line. However, there is now a special point, namely $\theta=\pi$, which maps to $x=0$ under stereographic projection, and this becomes the median and mode of the Cauchy distribution.
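A quick sanity check of this construction (a Python sketch; the explicit projection formula $x = \cot(\theta/2)$, projecting from $(0,1)$, is my assumption about the intended map): push a uniform angle through the projection and you get standard Cauchy samples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Uniform angle on the circle, measured from the projection point (0, 1).
theta = rng.uniform(0.0, 2.0 * np.pi, size=100_000)

# Stereographic projection onto the x-axis (assumed formula: x = cot(theta/2)).
x = 1.0 / np.tan(theta / 2.0)

# Quartiles of the standard Cauchy are -1, 0, +1; the sample should match.
print(np.quantile(x, [0.25, 0.5, 0.75]))

# Kolmogorov-Smirnov distance to the standard Cauchy CDF should be tiny.
print(stats.kstest(x, stats.cauchy.cdf).statistic)
```

The quartiles land near $-1$, $0$, $+1$ and the KS distance is small, consistent with the claim that a uniform angle on the circle, viewed through this projection, is exactly a standard Cauchy variable.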