Can 3 vectors all have negative pairwise correlations?


16

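Given three vectors $a$, $b$, and $c$, is it possible that the correlations between $a$ and $b$, $a$ and $c$, and $b$ and $c$ are all negative? I.e., is this possible?

$$\operatorname{corr}(a,b) < 0, \qquad \operatorname{corr}(a,c) < 0, \qquad \operatorname{corr}(b,c) < 0$$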

3
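Negative correlation means, geometrically, that the centered vectors make obtuse angles with one another. You should have no trouble drawing a configuration of three vectors in the plane with this property.
whuber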

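They cannot all be perfectly negatively correlated ($\rho = -1$), but in general some negative correlation is possible, with each correlation bounded by the others.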
karakfa

2
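@whuber Your comment seems to contradict Heikki Pulkkinen's answer, which says this is impossible for vectors in the plane. If you stand by it, you should turn your comment into an answer.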
RM

2
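@RM There is no contradiction between whuber and Heikki. The question asks about a data matrix $X$ of size $n \times 3$. Usually we would talk about $n$ data points in 3 dimensions, but this Q talks about three "vectors" in $n$ dimensions. Heikki says that all-negative correlations cannot happen if $n = 2$ (indeed, two points after centering are always perfectly correlated, so the correlations must be $\pm 1$ and cannot all be $-1$). Whuber says that 3 vectors in $n$ dimensions can effectively lie in a 2-dimensional subspace (i.e. $X$ has rank 2), and suggests imagining a Mercedes logo.
amoeba says Reinstate Monica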
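The "Mercedes logo" picture in the comments can be checked numerically. The sketch below uses three vectors of length 3 chosen for illustration (they are an assumption, not taken from the thread); `c_vec` and `angle_deg` are hypothetical names.

```r
# Three vectors in R^3 that are already centered (each sums to zero) and sit
# 120 degrees apart in the plane orthogonal to (1, 1, 1).
# Illustrative choice, not taken from the thread.
a     <- c( 2, -1, -1)
b     <- c(-1,  2, -1)
c_vec <- c(-1, -1,  2)

# Angle in degrees between two vectors after centering them
angle_deg <- function(x, y) {
  x <- x - mean(x)
  y <- y - mean(y)
  acos(sum(x * y) / sqrt(sum(x^2) * sum(y^2))) * 180 / pi
}

cors   <- c(ab = cor(a, b), ac = cor(a, c_vec), bc = cor(b, c_vec))
angles <- c(ab = angle_deg(a, b), ac = angle_deg(a, c_vec), bc = angle_deg(b, c_vec))

print(cors)                           # all three correlations are -0.5
print(angles)                         # all three angles are 120 degrees (obtuse)
print(qr(cbind(a, b, c_vec))$rank)    # the data matrix has rank 2
```

The correlation is the cosine of the angle between the centered vectors, so an obtuse angle corresponds to a negative correlation.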

1
Related: Bound for the correlation of three random variables. (cc @amoeba)
- Reinstate Monica

Answers:


19

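It is possible if the size of the vectors is 3 or larger. For example

$$a = (-1, 1, 1), \qquad b = (1, -9, -3), \qquad c = (2, 3, -1)$$

The correlations are

$$\operatorname{cor}(a,b) = -0.80\ldots, \qquad \operatorname{cor}(a,c) = -0.27\ldots, \qquad \operatorname{cor}(b,c) = -0.34\ldots$$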
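The example can be checked directly in R. This is a minimal sketch that assumes the vectors are $a = (-1, 1, 1)$, $b = (1, -9, -3)$, $c = (2, 3, -1)$; the name `c_vec` is used only to avoid masking R's `c()`.

```r
# Example vectors, with the signs assumed as stated in the lead-in
a     <- c(-1,  1,  1)
b     <- c( 1, -9, -3)
c_vec <- c( 2,  3, -1)

# Pearson correlations for the three pairs
cors <- c(ab = cor(a, b), ac = cor(a, c_vec), bc = cor(b, c_vec))
print(round(cors, 4))

# All three pairwise correlations are negative
stopifnot(all(cors < 0))
```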

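We can prove that this is impossible for vectors of size 2:

$$
\begin{aligned}
\operatorname{cor}(a,b) &< 0 \\
2\Big(\sum_i a_i b_i\Big) - \Big(\sum_i a_i\Big)\Big(\sum_i b_i\Big) &< 0 \\
2(a_1 b_1 + a_2 b_2) - (a_1 + a_2)(b_1 + b_2) &< 0 \\
2(a_1 b_1 + a_2 b_2) - (a_1 b_1 + a_1 b_2 + a_2 b_1 + a_2 b_2) &< 0 \\
a_1 b_1 + a_2 b_2 - a_1 b_2 - a_2 b_1 &< 0 \\
a_1 (b_1 - b_2) + a_2 (b_2 - b_1) &< 0 \\
(a_1 - a_2)(b_1 - b_2) &< 0
\end{aligned}
$$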

The formula makes sense: if $a_1$ is larger than $a_2$, then $b_2$ has to be larger than $b_1$ to make the correlation negative.

Similarly for correlations between (a,c) and (b,c) we get

$$(a_1 - a_2)(c_1 - c_2) < 0, \qquad (b_1 - b_2)(c_1 - c_2) < 0$$

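Clearly, these three inequalities cannot all hold at the same time: the product of the three left-hand sides is $(a_1 - a_2)^2 (b_1 - b_2)^2 (c_1 - c_2)^2 \ge 0$, whereas the product of three negative numbers would be negative.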
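The size-2 argument can also be checked by simulation. In this sketch (the seed, the number of trials, and the uniform distribution are arbitrary assumptions), each correlation between two-point vectors is $\pm 1$, so three negative correlations would make the product of the three equal to $-1$.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
trials <- 1000   # arbitrary number of random draws

# For each draw, multiply the three pairwise correlations of size-2 vectors
products <- replicate(trials, {
  a     <- runif(2)
  b     <- runif(2)
  c_vec <- runif(2)
  cor(a, b) * cor(a, c_vec) * cor(b, c_vec)
})

# A product of -1 would be needed for all three correlations to be negative
# (it would also arise from exactly one negative); tabulate what occurs.
print(table(round(products)))
```

The product never comes out as $-1$, in line with the algebraic proof.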


3
Another example of something unexpected that only happens in dimension three or higher.
nth

1
With vectors of size 2, correlations are usually $\pm 1$ (a straight line through two points), and you cannot have three correlations of $-1$ with three vectors of any size.
Henry

9

Yes, they can.

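Suppose you have a multivariate normal distribution $X \in \mathbb{R}^3$, $X \sim N(0, \Sigma)$. The only restriction on $\Sigma$ is that it has to be positive semi-definite.

So take the following example:

$$\Sigma = \begin{pmatrix} 1 & -0.2 & -0.2 \\ -0.2 & 1 & -0.2 \\ -0.2 & -0.2 & 1 \end{pmatrix}$$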

Its eigenvalues are all positive (1.2, 1.2, 0.6), and you can create vectors with negative correlation.
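A minimal sketch of this construction in base R, assuming the off-diagonal entries of $\Sigma$ are $-0.2$ (which is what the eigenvalues 1.2, 1.2, 0.6 imply). The sample size and seed are arbitrary, and sampling via the Cholesky factor is just one convenient way to draw from $N(0, \Sigma)$.

```r
# Covariance (here also correlation) matrix with -0.2 off the diagonal
Sigma <- matrix(-0.2, nrow = 3, ncol = 3)
diag(Sigma) <- 1

# Eigenvalues are 1.2, 1.2 and 0.6, so Sigma is positive definite
print(eigen(Sigma, symmetric = TRUE)$values)

set.seed(42)   # arbitrary seed
n <- 5000      # arbitrary sample size

# chol() returns an upper-triangular R with t(R) %*% R == Sigma, so rows of
# Z %*% R have covariance Sigma when rows of Z are standard normal.
Z <- matrix(rnorm(n * 3), ncol = 3)
X <- Z %*% chol(Sigma)

# Sample correlations between the three columns: all close to -0.2
print(round(cor(X), 2))
```

Each column of `X` is one of the three vectors in the question, and every pair of columns is negatively correlated.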


7

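Let's start with a correlation matrix for 3 variables

$$\Sigma = \begin{pmatrix} 1 & p & q \\ p & 1 & r \\ q & r & 1 \end{pmatrix}$$

Non-negative definiteness creates constraints on the pairwise correlations $p, q, r$, which can be written as

$$pqr \ge \frac{p^2 + q^2 + r^2 - 1}{2}$$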

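For example, if $p = q = -1$, the value of $r$ is restricted by $2r \ge r^2 + 1$, which forces $r = 1$. On the other hand, if $p = q = -\tfrac{1}{2}$, the constraint becomes $2r^2 - r - 1 \le 0$, so $r$ can lie anywhere in the range $-\tfrac{1}{2} \le r \le 1$.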
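The constraint $pqr \ge (p^2 + q^2 + r^2 - 1)/2$ is a quadratic inequality in $r$, and solving it gives $pq - \sqrt{(1-p^2)(1-q^2)} \le r \le pq + \sqrt{(1-p^2)(1-q^2)}$. The sketch below computes that interval and cross-checks it against the eigenvalues; `r_range` and `is_valid` are hypothetical helper names.

```r
# Admissible interval for r given p and q, from solving
# r^2 - 2*p*q*r + p^2 + q^2 - 1 <= 0 for r
r_range <- function(p, q) {
  half_width <- sqrt((1 - p^2) * (1 - q^2))
  c(lower = p * q - half_width, upper = p * q + half_width)
}

# Direct check: is the 3x3 matrix non-negative definite (up to rounding)?
is_valid <- function(p, q, r, tol = 1e-12) {
  S <- matrix(c(1, p, q,
                p, 1, r,
                q, r, 1), nrow = 3)
  min(eigen(S, symmetric = TRUE)$values) >= -tol
}

print(r_range(-1, -1))       # lower = upper = 1, so r is forced to 1
print(r_range(-0.5, -0.5))   # lower = -0.5, upper = 1

print(is_valid(-0.5, -0.5, -0.5))   # TRUE: on the boundary
print(is_valid(-0.5, -0.5, -0.6))   # FALSE: r below the lower bound
```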

Answering the interesting follow up question by @amoeba: "what is the lowest possible correlation that all three pairs can simultaneously have?"

Let $p = q = r = x < 0$. Find the smallest root of $2x^3 - 3x^2 + 1$, which gives $x = -\tfrac{1}{2}$. Perhaps not surprising for some.
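The root can be found numerically with base R's `polyroot`, which takes coefficients in increasing order of power. This is only a sketch; `equicor` is a hypothetical helper that builds the matrix with a common correlation $x$.

```r
# Coefficients of 2x^3 - 3x^2 + 1 in increasing order: 1 + 0x - 3x^2 + 2x^3
roots <- polyroot(c(1, 0, -3, 2))
print(round(Re(roots), 6))    # a double root at 1 and a single root at -0.5

lowest <- min(Re(roots))      # lowest common correlation for all three pairs

# 3x3 correlation matrix with every off-diagonal entry equal to x
equicor <- function(x) {
  S <- matrix(x, nrow = 3, ncol = 3)
  diag(S) <- 1
  S
}

# At x = -0.5 the smallest eigenvalue is numerically zero (boundary case);
# at x = -0.6 it is negative, so that matrix is not a valid correlation matrix.
print(eigen(equicor(lowest), symmetric = TRUE)$values)
print(eigen(equicor(-0.6), symmetric = TRUE)$values)
```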

A stronger argument can be made if one of the correlations equals $-1$, say $r = -1$. From the same inequality, $-2pq \ge p^2 + q^2$, that is $(p + q)^2 \le 0$, we can deduce that $p = -q$. Therefore, if two correlations are $-1$, the third one must be $1$.



2

A simple R function to explore this:

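f <- function(n, trials = 10000) {
  # Estimate the probability that three independent uniform vectors of
  # length n have all three pairwise correlations negative.
  count <- 0
  for (i in seq_len(trials)) {
    a <- runif(n)
    b <- runif(n)
    c <- runif(n)
    # Each comparison is a single TRUE/FALSE, so use the scalar operator &&
    if (cor(a, b) < 0 && cor(a, c) < 0 && cor(b, c) < 0) {
      count <- count + 1
    }
  }
  count / trials
}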

As a function of n, f(n) starts at 0 (at n = 2), becomes nonzero at n = 3 (with typical values around 0.06), then increases to around 0.11 by n = 15, after which it seems to stabilize:

[Figure: f(n) plotted against n]

So, not only is it possible to have all three correlations negative, it doesn't seem to be terribly uncommon (at least for uniform distributions).
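To regenerate the curve, something like the following sketch could be used. It re-implements the simulation in a self-contained form (`all_negative_rate` is a hypothetical name), and the seed, the range of n, and the reduced trial count are arbitrary choices made to keep the run short.

```r
# Fraction of trials in which three uniform vectors of length n have all
# three pairwise correlations negative
all_negative_rate <- function(n, trials = 10000) {
  hits <- replicate(trials, {
    X <- matrix(runif(n * 3), ncol = 3)   # columns play the role of a, b, c
    R <- cor(X)                           # 3x3 correlation matrix
    all(R[upper.tri(R)] < 0)              # are all three pairs negative?
  })
  mean(hits)
}

set.seed(123)   # arbitrary seed
sizes <- 2:20
rates <- sapply(sizes, all_negative_rate, trials = 2000)
print(round(setNames(rates, sizes), 3))

# Uncomment to draw the curve:
# plot(sizes, rates, type = "b", xlab = "n", ylab = "f(n)")
```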

Licensed under cc by-sa 3.0 with attribution required.