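This is a real thing. To find the answer, we need to examine what we know about correlation itself.
The correlation matrix of a vector-valued random variable $\mathbf X=(X_1,X_2,\ldots,X_p)$ is the variance-covariance matrix, or simply the "variance," of the standardized version of $\mathbf X$. That is, each $X_i$ is replaced by its recentered, rescaled version.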
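The covariance of $X_i$ and $X_j$ is the expectation of the product of their centered versions. That is, writing $X^\prime_i=X_i-E[X_i]$ and $X^\prime_j=X_j-E[X_j]$, we have
$$\operatorname{Cov}(X_i,X_j)=E[X^\prime_iX^\prime_j].$$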
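The variance of $\mathbf X$, which I will write $\operatorname{Var}(\mathbf X)$, is not a single number. It is the array of values
$$\operatorname{Var}(\mathbf X)_{ij}=\operatorname{Cov}(X_i,X_j).$$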
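The way to think of the covariance for the intended generalization is to consider it a tensor. That means it is an entire collection of quantities $v_{ij}$, indexed by $i$ and $j$ ranging from $1$ through $p$, whose values change in a particularly simple, predictable way when $\mathbf X$ undergoes a linear transformation. Specifically, let $\mathbf Y=(Y_1,Y_2,\ldots,Y_q)$ be another vector-valued random variable defined by
$$Y_i=\sum_{j=1}^p a_i^{\,j}X_j.$$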
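The constants $a_i^{\,j}$ ($i$ and $j$ are indexes; $j$ is not a power) form a $q\times p$ array $\mathbb A=(a_i^{\,j})$, $j=1,\ldots,p$ and $i=1,\ldots,q$. The linearity of expectation implies
$$\operatorname{Var}(\mathbf Y)_{ij}=\sum_{k,l} a_i^{\,k}a_j^{\,l}\operatorname{Var}(\mathbf X)_{kl}.$$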
In matrix notation,
$$\operatorname{Var}(\mathbf Y)=\mathbb A\operatorname{Var}(\mathbf X)\mathbb A^\prime.$$
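The transformation rule can be checked numerically on data, using the empirical distribution in place of a theoretical one. The following is only a sketch: the sample sizes, the random seed, the array `A`, and the helper `var` are all illustrative assumptions, not part of the argument above.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, n = 3, 2, 1000

# Rows are observations of a p-dimensional X with correlated components.
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
A = rng.normal(size=(q, p))   # hypothetical q x p array of constants a_i^j
Y = X @ A.T                   # Y_i = sum_j a_i^j X_j, applied row by row


def var(data):
    """Variance-covariance matrix of the empirical distribution (divisor n)."""
    centered = data - data.mean(axis=0)
    return centered.T @ centered / len(data)


var_X = var(X)
var_Y = var(Y)

# Var(Y) should agree with A Var(X) A' up to floating-point error.
print(np.allclose(var_Y, A @ var_X @ A.T))
```

The agreement is exact (up to rounding) rather than approximate because the empirical distribution is itself a distribution, so linearity of expectation applies to it directly.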
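All the components of $\operatorname{Var}(\mathbf X)$ actually are univariate variances, due to the Polarization Identity
$$4\operatorname{Cov}(X_i,X_j)=\operatorname{Var}(X_i+X_j)-\operatorname{Var}(X_i-X_j).$$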
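This tells us that if you understand the variances of univariate random variables, you already understand the covariances of bivariate variables: they are "just" linear combinations of variances.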
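A quick numerical illustration of this point, as a sketch only: the two simulated variables and the helper names `variance` and `covariance` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.5 * x + rng.exponential(size=500)   # deliberately correlated with x


def variance(z):
    """Univariate variance of the empirical distribution (divisor n)."""
    return np.mean((z - z.mean()) ** 2)


def covariance(u, v):
    """Covariance as the mean product of the centered versions."""
    return np.mean((u - u.mean()) * (v - v.mean()))


lhs = 4 * covariance(x, y)
rhs = variance(x + y) - variance(x - y)   # built from univariate variances only
print(np.isclose(lhs, rhs))
```

Only the univariate function `variance` appears on the right hand side, which is the sense in which a covariance is a linear combination of variances.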
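The expression in the question is perfectly analogous: the variables $X_i$ have been standardized as in $(1)$. We can understand what it represents by considering what it means for any variables, standardized or not. We would replace each $X_i$ by its centered version, as in $(2)$, and form quantities having three indexes,
$$\mu_3(\mathbf X)_{ijk}=E[X^\prime_iX^\prime_jX^\prime_k].$$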
These are the central (multivariate) moments of degree $3$. As in $(4)$, they form a tensor: when $\mathbf Y=\mathbb A\mathbf X$, then
$$\mu_3(\mathbf Y)_{ijk}=\sum_{l,m,n}a_i^{\,l}a_j^{\,m}a_k^{\,n}\mu_3(\mathbf X)_{lmn}.$$
The indexes in this triple sum range over all combinations of integers from 1 through p.
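The triple sum maps directly onto an `einsum` contraction. The sketch below computes the third central moment tensor of some simulated data and checks the transformation rule; the helper `mu3`, the data, and the array `A` are illustrative assumptions.

```python
import numpy as np


def mu3(data):
    """Third central moment tensor of the empirical distribution.

    data has one observation per row; the result has shape (p, p, p).
    """
    centered = data - data.mean(axis=0)
    return np.einsum("ni,nj,nk->ijk", centered, centered, centered) / len(data)


rng = np.random.default_rng(2)
X = rng.exponential(size=(2000, 3))   # skewed components, p = 3
A = rng.normal(size=(2, 3))           # hypothetical q x p array, q = 2
Y = X @ A.T

mu3_Y = mu3(Y)
# Triple sum over l, m, n of a_i^l a_j^m a_k^n mu3(X)_{lmn}.
transformed = np.einsum("il,jm,kn,lmn->ijk", A, A, A, mu3(X))
print(np.allclose(mu3_Y, transformed))
```

The tensor is symmetric in its three indexes, so for $p$ variables it has only $p(p+1)(p+2)/6$ distinct components rather than $p^3$.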
The analog of the Polarization Identity is
$$24\mu_3(\mathbf X)_{ijk}=\mu_3(X_i+X_j+X_k)-\mu_3(X_i-X_j+X_k)-\mu_3(X_i+X_j-X_k)+\mu_3(X_i-X_j-X_k).$$
On the right hand side, $\mu_3$ is the univariate central third moment, which for a standardized variable is its skewness. We may therefore think of $\mu_3(\mathbf X)$ as being the multivariate skewness of $\mathbf X$. It is a tensor of rank three (that is, with three indices) whose values are linear combinations of the skewnesses of various sums and differences of the $X_i$. If we were to seek interpretations, then, we would think of these components as measuring in $p$ dimensions whatever the skewness is measuring in one dimension. In many cases,
The first moments measure the location of a distribution;
The second moments (the variance-covariance matrix) measure its spread;
The standardized second moments (the correlations) indicate how the spread varies in p-dimensional space; and
The standardized third and fourth moments are taken to measure the shape of a distribution relative to its spread.
To elaborate on what a multidimensional "shape" might mean, observe that we can understand PCA as a mechanism to reduce any multivariate distribution to a standard version located at the origin with equal spreads in all directions. After PCA is performed, then, $\mu_3$ would provide the simplest indicators of the multidimensional shape of the distribution. These ideas apply equally well to data as to random variables, because data can always be analyzed in terms of their empirical distribution.
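As a rough sketch of that procedure, the code below standardizes simulated data by rotating to principal axes and rescaling each axis to unit variance, then computes the third moment tensor of the result. The data, the seed, and the variable names are assumptions for illustration, and this particular standardization is only one choice: it is determined only up to the signs and ordering of the principal axes.

```python
import numpy as np

rng = np.random.default_rng(3)
# Skewed, correlated two-dimensional data (purely illustrative).
X = rng.exponential(size=(5000, 2)) @ np.array([[2.0, 0.5], [0.0, 1.0]])

centered = X - X.mean(axis=0)
cov = centered.T @ centered / len(X)

# Principal axes and variances of the empirical distribution.
eigvals, eigvecs = np.linalg.eigh(cov)

# Rotate to the principal axes, then rescale each axis to unit variance.
Z = centered @ eigvecs / np.sqrt(eigvals)

cov_Z = Z.T @ Z / len(Z)   # should be the identity matrix
shape_tensor = np.einsum("ni,nj,nk->ijk", Z, Z, Z) / len(Z)

print(np.allclose(cov_Z, np.eye(2)))
print(shape_tensor)
```

Once location and spread have been removed in this way, the entries of `shape_tensor` are what remains to be described by third moments.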
Reference
Alan Stuart & J. Keith Ord, Kendall's Advanced Theory of Statistics Fifth Edition, Volume 1: Distribution Theory; Chapter 3, Moments and Cumulants. Oxford University Press (1987).
Appendix: Proof of the Polarization Identity
Let $x_1,\ldots,x_n$ be algebraic variables. There are $2^n$ ways to add and subtract all $n$ of them. When we raise each of these sums-and-differences to the $n^\text{th}$ power, pick a suitable sign for each of those results, and add them up, we will get a multiple of $x_1x_2\cdots x_n$.
More formally, let $S=\{1,-1\}^n$ be the set of all $n$-tuples of $\pm1$, so that any element $s\in S$ is a vector $s=(s_1,s_2,\ldots,s_n)$ whose coefficients are all $\pm1$. The claim is
$$2^n n!\,x_1x_2\cdots x_n=\sum_{s\in S}s_1s_2\cdots s_n\,(s_1x_1+s_2x_2+\cdots+s_nx_n)^n.\tag{1}$$
Indeed, the Multinomial Theorem states that the coefficient of the monomial $x_1^{i_1}x_2^{i_2}\cdots x_n^{i_n}$ (where the $i_j$ are nonnegative integers summing to $n$) in the expansion of any term on the right hand side is
$$\binom{n}{i_1,i_2,\ldots,i_n}s_1^{i_1}s_2^{i_2}\cdots s_n^{i_n}.$$
In the sum $(1)$, the coefficients involving $x_1^{i_1}$ appear in pairs where one of each pair involves the case $s_1=1$, with coefficient proportional to $s_1$ times $s_1^{i_1}$, equal to $1$, and the other of each pair involves the case $s_1=-1$, with coefficient proportional to $-1$ times $(-1)^{i_1}$, equal to $(-1)^{i_1+1}$. They cancel in the sum whenever $i_1+1$ is odd. The same argument applies to $i_2,\ldots,i_n$. Consequently, the only monomials that occur with nonzero coefficients must have odd powers of all the $x_i$. The only such monomial is $x_1x_2\cdots x_n$. It appears with coefficient $\binom{n}{1,1,\ldots,1}=n!$ in all $2^n$ terms of the sum. Consequently its coefficient is $2^nn!$, QED.
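The identity $(1)$ is easy to spot-check with exact integer arithmetic. This is a sketch, not a substitute for the proof; the function names and the test values are arbitrary choices made for the example.

```python
import itertools
import math


def signed_power_sum(x):
    """Right hand side of (1): sum over all sign vectors s of
    s_1 * ... * s_n * (s_1 x_1 + ... + s_n x_n) ** n."""
    n = len(x)
    total = 0
    for s in itertools.product((1, -1), repeat=n):
        linear = sum(si * xi for si, xi in zip(s, x))
        total += math.prod(s) * linear ** n
    return total


def monomial_side(x):
    """Left hand side of (1): 2**n * n! * x_1 * ... * x_n."""
    n = len(x)
    return 2 ** n * math.factorial(n) * math.prod(x)


# Arbitrary integer inputs for n = 1, 2, 3, 4; each line should show two equal values.
for x in ([3], [2, 5], [2, 3, 7], [1, -4, 6, 5]):
    print(len(x), signed_power_sum(x), monomial_side(x))
```

Using Python integers avoids any rounding, so the two sides can be compared with `==` rather than with a tolerance.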
We need take only half of each pair associated with $x_1$: that is, we can restrict the right hand side of $(1)$ to the terms with $s_1=1$ and halve the coefficient on the left hand side to $2^{n-1}n!$. That gives precisely the two versions of the Polarization Identity quoted in this answer for the cases $n=2$ and $n=3$: $2^{2-1}2!=4$ and $2^{3-1}3!=24$.
Of course the Polarization Identity for algebraic variables immediately implies it for random variables: let each $x_i$ be a random variable $X_i$. Take expectations of both sides. The result follows by linearity of expectation.
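The same step can be illustrated on data for the case $n=3$, where the expectation is taken over the empirical distribution. The simulated variables and the helper `mu3_univariate` below are assumptions made for this sketch.

```python
import numpy as np


def mu3_univariate(z):
    """Univariate central third moment of the empirical distribution."""
    return np.mean((z - z.mean()) ** 3)


rng = np.random.default_rng(5)
# Three correlated, skewed variables (illustrative only).
X = rng.exponential(size=(1000, 3)) @ rng.normal(size=(3, 3))
xi, xj, xk = X.T

# Left hand side: 24 times the mixed third central moment E[X'_i X'_j X'_k].
centered = X - X.mean(axis=0)
lhs = 24 * np.mean(centered[:, 0] * centered[:, 1] * centered[:, 2])

# Right hand side: univariate third moments of sums and differences.
rhs = (mu3_univariate(xi + xj + xk) - mu3_univariate(xi - xj + xk)
       - mu3_univariate(xi + xj - xk) + mu3_univariate(xi - xj - xk))

print(np.isclose(lhs, rhs))
```

The check works because centering a sum or difference of variables gives the same sum or difference of their centered versions, so the algebraic identity applies to the centered values observation by observation.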