How to prove that $\frac{|X_i-\bar X|}{S}\le\frac{n-1}{\sqrt n}$


9

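I have been trying to establish the inequality

$$|T_i|=\frac{|X_i-\bar X|}{S}\le\frac{n-1}{\sqrt n}$$

where $\bar X$ is the sample mean and $S$ is the sample standard deviation, that is, $S=\sqrt{\sum_{i=1}^n\frac{(X_i-\bar X)^2}{n-1}}$.

It is easy to see that $\sum_{i=1}^n T_i^2=n-1$ and therefore $|T_i|<\sqrt{n-1}$, but this is not very close to what I have been looking for, nor is it a useful bound. I have experimented with the Cauchy-Schwarz and triangle inequalities, without success. There must be a subtle step that I am missing somewhere. Thank you for your help.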
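For readers who want to see the quantities involved, here is a minimal numerical sketch, not part of the original question. It assumes Python's standard library only; the helper `t_values`, the seed, and the sample size are illustrative choices. It checks the identity $\sum_i T_i^2=n-1$ and compares the largest $|T_i|$ with the conjectured bound $(n-1)/\sqrt n$ and the cruder bound $\sqrt{n-1}$.

```python
import math
import random
import statistics


def t_values(xs):
    """Return T_i = (X_i - mean) / S, where S uses the n-1 denominator."""
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)  # sample standard deviation (n-1 denominator)
    return [(x - mean) / s for x in xs]


random.seed(0)                      # illustrative seed
n = 7                               # illustrative sample size
xs = [random.gauss(0, 1) for _ in range(n)]
ts = t_values(xs)

sum_sq = sum(t * t for t in ts)     # identity from the question: equals n - 1
max_abs = max(abs(t) for t in ts)   # largest |T_i| in this sample
bound = (n - 1) / math.sqrt(n)      # the bound to be proved
crude = math.sqrt(n - 1)            # the weaker bound noted in the question

print(sum_sq, max_abs, bound, crude)
```

A single random sample proves nothing, of course; it only illustrates that the target bound is tighter than $\sqrt{n-1}$.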

Answers:


10

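This is Samuelson's inequality, and it needs the $\le$ sign. If you take the Wikipedia version and rework it for the $n-1$ definition of $S$, you find that it becomes

$$\frac{|X_i-\bar X|}{S}\le\frac{n-1}{\sqrt n}$$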

In the book it is stated as a strict inequality, but I have worked it out now, thank you.
JohnK

5

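After simplifying the problem by means of routine procedures, it can be solved by converting it into a dual minimization program, which has a well-known answer with an elementary proof. Perhaps this dualization is the "subtle step" referred to in the question. The inequality can also be established in a purely mechanical manner by maximizing $|T_i|$ via Lagrange multipliers.

First, though, I offer a more elegant solution based on the geometry of least squares. It requires no preliminary simplification and is almost immediate, providing direct intuition into the result. As suggested in the question, the problem reduces to the Cauchy-Schwarz inequality.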


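Geometric solution

Consider $\mathbf x=(X_1,X_2,\ldots,X_n)$ as an $n$-dimensional vector in Euclidean space with the usual dot product. Let $\mathbf y=(0,0,\ldots,0,1,0,\ldots,0)$ be the $i^\text{th}$ basis vector and $\mathbf 1=(1,1,\ldots,1)$. Write $\hat{\mathbf x}$ and $\hat{\mathbf y}$ for the orthogonal projections of $\mathbf x$ and $\mathbf y$ onto the orthogonal complement of $\mathbf 1$. (In statistical terminology, they are the residuals with respect to the means.) Then, since $X_i-\bar X=\hat{\mathbf x}\cdot\mathbf y$ and $S=||\hat{\mathbf x}||/\sqrt{n-1}$,

$$|T_i|=\sqrt{n-1}\,\frac{|\hat{\mathbf x}\cdot\mathbf y|}{||\hat{\mathbf x}||}=\sqrt{n-1}\,\frac{|\hat{\mathbf x}\cdot\hat{\mathbf y}|}{||\hat{\mathbf x}||}$$

is $\sqrt{n-1}$ times the size of the component of $\hat{\mathbf y}$ in the $\hat{\mathbf x}$ direction. By Cauchy-Schwarz, it is maximized exactly when $\hat{\mathbf x}$ is parallel to $\hat{\mathbf y}=(-1,-1,\ldots,-1,n-1,-1,-1,\ldots,-1)/n$, for which

$$T_i=\pm\sqrt{n-1}\,\frac{\hat{\mathbf y}\cdot\hat{\mathbf y}}{||\hat{\mathbf y}||}=\pm\sqrt{n-1}\,||\hat{\mathbf y}||=\pm\frac{n-1}{\sqrt n},$$ QED.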
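The projection argument can be checked numerically. The following is a rough sketch under my own assumptions: the data vector, the index `i` (0-based), and the helper names `center`, `dot`, and `norm` are illustrative. It computes $|T_i|$ both from the definition and from the dot-product formula, then compares with the Cauchy-Schwarz cap $\sqrt{n-1}\,||\hat{\mathbf y}||$.

```python
import math


def center(v):
    """Project v onto the orthogonal complement of the all-ones vector."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


n, i = 6, 2                                  # illustrative size and 0-based index
x = [2.0, -1.0, 4.5, 0.3, 1.1, -3.2]         # arbitrary illustrative data
y = [1.0 if j == i else 0.0 for j in range(n)]

x_hat, y_hat = center(x), center(y)

# |T_i| via the geometric formula sqrt(n-1) |x_hat . y_hat| / ||x_hat||
t_geom = math.sqrt(n - 1) * abs(dot(x_hat, y_hat)) / norm(x_hat)

# |T_i| directly from the definition, with S = ||x_hat|| / sqrt(n-1)
s = norm(x_hat) / math.sqrt(n - 1)
t_direct = abs(x[i] - sum(x) / n) / s

# Cauchy-Schwarz cap: sqrt(n-1) ||y_hat||, which should equal (n-1)/sqrt(n)
cap = math.sqrt(n - 1) * norm(y_hat)

print(t_geom, t_direct, cap)
```

The two computations of $|T_i|$ should agree up to rounding, and neither should exceed the cap.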

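Incidentally, this solution provides an exhaustive characterization of all the cases where $|T_i|$ is maximized: they are all of the form

$$\mathbf x=\sigma\hat{\mathbf y}+\mu\mathbf 1=\frac{\sigma}{n}(-1,-1,\ldots,-1,n-1,-1,-1,\ldots,-1)+\mu(1,1,\ldots,1)$$

for all real $\mu$ and all real $\sigma\ne 0$.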
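As a quick sanity check of this characterization, here is a small sketch. The values of `sigma`, `mu`, `n`, and the 0-based index `i` are hypothetical choices of mine. It builds one such vector and confirms that $|T_i|$ attains $(n-1)/\sqrt n$.

```python
import math
import statistics


def abs_t(xs, i):
    """|T_i| with S computed using the n-1 denominator."""
    return abs(xs[i] - statistics.fmean(xs)) / statistics.stdev(xs)


n, i = 5, 3                 # illustrative size and 0-based index
sigma, mu = -2.5, 7.0       # hypothetical nonzero scale and arbitrary shift

# y_hat is the i-th basis vector with its mean 1/n removed
y_hat = [(n - 1) / n if j == i else -1 / n for j in range(n)]
x = [sigma * a + mu for a in y_hat]

attained = abs_t(x, i)
bound = (n - 1) / math.sqrt(n)

print(attained, bound)
```

The sign of `sigma` only flips the sign of $T_i$, so negative values attain the bound as well.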

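This analysis generalizes readily to the case where $\{\mathbf 1\}$ is replaced by any set of regressors. Evidently the maximum of $T_i$ is proportional to the length of the residual of $\mathbf y$, $||\hat{\mathbf y}||$.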


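Simplification

Because $T_i$ is invariant under changes of location and scale, we may assume with no loss of generality that the $X_i$ sum to zero and their squares sum to $n-1$. This identifies $|T_i|$ with $|X_i|$, since $S$ (the root mean square) is $1$. Maximizing it is tantamount to maximizing $|T_i|^2=T_i^2=X_i^2$. No generality is lost by taking $i=1$, either, since the $X_i$ are exchangeable.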
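The invariance and the normalization can be illustrated with a short sketch. The data and the constants `a` and `b` are hypothetical, and `a` is taken positive so that signs are preserved. Rescaled data give the same $T_i$, and the standardized values sum to zero with squares summing to $n-1$.

```python
import statistics


def t_values(xs):
    """Return T_i = (X_i - mean) / S, where S uses the n-1 denominator."""
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)
    return [(x - mean) / s for x in xs]


xs = [3.0, 8.0, -1.0, 4.0, 10.0]    # arbitrary illustrative data
n = len(xs)

a, b = 2.5, -7.0                    # hypothetical positive scale and shift
moved = [a * x + b for x in xs]

ts = t_values(xs)
ts_moved = t_values(moved)          # should match ts (location-scale invariance)

# The standardized values play the role of the normalized X_i
total = sum(ts)                     # should be 0
total_sq = sum(t * t for t in ts)   # should be n - 1
s_of_ts = statistics.stdev(ts)      # should be 1, so |T_i| = |X_i| after normalizing

print(total, total_sq, s_of_ts)
```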


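Solution via a dual formulation

A dual problem is to fix the value of $X_1^2$ and ask what values of the remaining $X_j, j\ne 1$ minimize the sum of squares $\sum_{j=1}^n X_j^2$, given that $\sum_{j=1}^n X_j=0$. Because $X_1$ is given, this is the problem of minimizing $\sum_{j=2}^n X_j^2$ given that $\sum_{j=2}^n X_j=-X_1$.

The solution is easily found in many ways. One of the most elementary is to write

$$X_j=-\frac{X_1}{n-1}+\varepsilon_j,\ j=2,3,\ldots,n$$

for which $\sum_{j=2}^n\varepsilon_j=0$. Expanding the objective function and using this sum-to-zero identity to simplify it produces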

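$$\sum_{j=2}^n X_j^2=\sum_{j=2}^n\left(-\frac{X_1}{n-1}+\varepsilon_j\right)^2=\sum_{j=2}^n\left(-\frac{X_1}{n-1}\right)^2-2\frac{X_1}{n-1}\sum_{j=2}^n\varepsilon_j+\sum_{j=2}^n\varepsilon_j^2=\text{Constant}+\sum_{j=2}^n\varepsilon_j^2,$$

immediately showing that the unique solution is $\varepsilon_j=0$ for all $j$. For this solution,

$$(n-1)S^2=X_1^2+(n-1)\left(-\frac{X_1}{n-1}\right)^2=\left(1+\frac{1}{n-1}\right)X_1^2=\frac{n}{n-1}X_1^2$$

and

$$|T_i|=\frac{|X_1|}{S}=\frac{|X_1|}{\sqrt{\frac{n}{(n-1)^2}X_1^2}}=\frac{n-1}{\sqrt n},$$

QED.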
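The dual argument lends itself to a numerical illustration. This is only a sketch under assumed values: `x1`, `n`, and the random perturbations are my own choices. It fixes $X_1$, perturbs the remaining values by zero-sum $\varepsilon_j$, and checks that the sum of squares exceeds its minimum by exactly $\sum\varepsilon_j^2$.

```python
import math
import random

n = 6                                   # illustrative sample size
x1 = 1.7                                # hypothetical fixed value of X_1
base = -x1 / (n - 1)                    # common value of X_2..X_n at the minimum

random.seed(1)                          # illustrative seed
raw = [random.uniform(-1, 1) for _ in range(n - 1)]
shift = sum(raw) / (n - 1)
eps = [e - shift for e in raw]          # perturbations forced to sum to zero

perturbed = [base + e for e in eps]     # still satisfies sum_{j>=2} X_j = -X_1

ss_min = (n - 1) * base ** 2            # the "Constant" term
ss_perturbed = sum(v * v for v in perturbed)
excess = ss_perturbed - ss_min          # should equal sum of eps_j^2

# At the minimizer (all eps_j = 0), recover |T_1| = (n-1)/sqrt(n)
s2 = (x1 ** 2 + ss_min) / (n - 1)
t_at_min = abs(x1) / math.sqrt(s2)

print(excess, sum(e * e for e in eps), t_at_min)
```

Since minimizing the sum of squares for fixed $X_1$ minimizes $S$, it maximizes $|X_1|/S$, which is how the dual answers the original question.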


Solution via machinery

Return to the simplified program we began with:

$$\text{Maximize } X_1^2$$

subject to

$$\sum_{i=1}^n X_i=0\ \text{ and }\ \sum_{i=1}^n X_i^2-(n-1)=0.$$

The method of Lagrange multipliers (which is almost purely mechanical and straightforward) equates a nontrivial linear combination of the gradients of these three functions to zero:

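$$(0,0,\ldots,0)=\lambda_1 D(X_1^2)+\lambda_2 D\left(\sum_{i=1}^n X_i\right)+\lambda_3 D\left(\sum_{i=1}^n X_i^2-(n-1)\right).$$

Component by component, these $n$ equations are

$$\begin{aligned}0&=2\lambda_1X_1+\lambda_2+2\lambda_3X_1\\0&=\lambda_2+2\lambda_3X_2\\0&=\cdots\\0&=\lambda_2+2\lambda_3X_n.\end{aligned}$$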

The last $n-1$ of them imply either $X_2=X_3=\cdots=X_n=-\lambda_2/(2\lambda_3)$ or $\lambda_2=\lambda_3=0$. (We may rule out the latter case because then the first equation implies $\lambda_1=0$, trivializing the linear combination.) The sum-to-zero constraint produces $X_1=-(n-1)X_2$. The sum-of-squares constraint provides the two solutions

$$X_1=\pm\frac{n-1}{\sqrt n};\ X_2=X_3=\cdots=X_n=\mp\frac{1}{\sqrt n}.$$

They both yield

$$|T_i|=|X_1|\le\left|\pm\frac{n-1}{\sqrt n}\right|=\frac{n-1}{\sqrt n}.$$
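To double-check the stationary point, here is a small sketch. The multiplier values are my own, obtained by solving the component equations with the normalization $\lambda_1=1$; they are not given in the answer. It verifies that the positive solution satisfies all $n$ equations and both constraints.

```python
import math

n = 5                                    # illustrative sample size
root_n = math.sqrt(n)
x1 = (n - 1) / root_n                    # positive-sign solution for X_1
x_rest = -1 / root_n                     # common value of X_2..X_n
x = [x1] + [x_rest] * (n - 1)

# Multipliers derived from the component equations (assumed normalization lam1 = 1):
# equations 2..n give lam2 = -2*lam3*X_2; substituting into equation 1 gives lam3.
lam1 = 1.0
lam3 = -lam1 * x1 / (x1 - x_rest)
lam2 = -2 * lam3 * x_rest

# Residuals of the n stationarity equations
residuals = [2 * lam1 * x[0] + lam2 + 2 * lam3 * x[0]]
residuals += [lam2 + 2 * lam3 * xj for xj in x[1:]]

sum_constraint = sum(x)                              # should be 0
square_constraint = sum(v * v for v in x) - (n - 1)  # should be 0

print(residuals, sum_constraint, square_constraint)
```

With the constraints in force $S=1$, so $|T_1|=|X_1|=(n-1)/\sqrt n$ at this point.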

Thank you for your addendum, geometry is very powerful and of all three solutions it is the most intuitive to me.
JohnK

0

The inequality as stated is true. It is quite clear intuitively that we get the most difficult case for the inequality (that is, maximizing the left hand side for given $S^2$) by choosing one value, say $x_1$, as large as possible while having all the others equal. Let us look at an example with such a configuration:

$$n=4,\ x_1=x_2=x_3=0,\ x_4=4,\ \bar x=1,\ S^2=4,$$
now $\frac{|x_i-\bar x|}{S}$ is $\frac12$ or $\frac32$ depending on $i$, while the given upper limit is equal to $\frac{4-1}{\sqrt 4}=1.5$, which is just enough. That idea can be completed to a proof.
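The arithmetic of this example is easy to reproduce. Below is a minimal sketch using only Python's standard library; the variable names are mine.

```python
import math
import statistics

xs = [0, 0, 0, 4]                       # the configuration from the example
n = len(xs)

mean = statistics.fmean(xs)             # sample mean
var = statistics.variance(xs)           # sample variance S^2 (n-1 denominator)
s = math.sqrt(var)

ratios = [abs(x - mean) / s for x in xs]   # |x_i - mean| / S for each i
bound = (n - 1) / math.sqrt(n)             # the claimed upper limit

print(mean, var, ratios, bound)
```

The largest ratio coincides with the bound, which is the equality case described above.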

EDIT

We will now prove the claim, as hinted above. First, for any given vector $x=(x_1,x_2,\ldots,x_n)$ in this problem, we can replace it with $x-\bar x$ without changing either side of the inequality above. So, in the following let us assume that $\bar x=0$. We can also by relabelling assume that $x_1$ is largest. Then, by choosing first $x_1>0$ and then $x_2=x_3=\cdots=x_n=-\frac{x_1}{n-1}$ we can check by simple algebra that we have equality in the claimed inequality. So, it is sharp.

Then, define the (convex) region $R$ by

$$R=\left\{x\in\mathbb R^n:\bar x=0,\ \sum_i(x_i-\bar x)^2/(n-1)\le S^2\right\}$$
for a given positive constant $S^2$. Note that $R$ is the intersection of a hyperplane with a sphere centered at the origin, so is a sphere in $(n-1)$-space. Our problem can now be formulated as
$$\max_{x\in R}\max_i|x_i|$$
since an $x$ maximizing that will be the most difficult case for the inequality. This is a problem of finding the maximum of a convex function over a convex set, which in general are difficult problems (minimums are easy!). But, in this case the convex region is a sphere centered on the origin, and the function we want to maximize is the absolute value of the coordinates. It is obvious that that maximum is found at the boundary sphere of $R$, and by taking $|x_1|$ maximal, our first test case is forced.

@JohnK you can delete your comments now, the post is corrected
kjetil b halvorsen

Although this answer shows that the inequality (assuming it is true, which it is) is tight, it isn't evident how that single calculation could be "completed to a proof." Could you provide some indication of how that would be done?
whuber

Will do, but tomorrow; now I have to prepare tomorrow's class.
kjetil b halvorsen

Thank you--I appreciate your careful formulation of the problem. But your "proof" seems to come to the statement that "it is obvious that." You could always apply Lagrange multipliers to finish the job, but it would be nice to see an approach that (a) actually is a proof and (b) provides insight.
whuber

2
@whuber If you have the time, I would appreciate it if you can post your Lagrange multipliers solution. I think the inequality overall is not as famous as it should be.
JohnK
Licensed under cc by-sa 3.0 with attribution required.