Statistical distance between uniform and biased coin


9

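Let $U$ be the uniform distribution over $n$ bits, and let $D$ be the distribution over $n$ bits in which the bits are independent and each bit is 1 with probability $1/2 - \epsilon$. Is it true that the statistical distance between $D$ and $U$ is $\Omega(\epsilon\sqrt{n})$ when $n \le 1/\epsilon^2$?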


2
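Yes. The statistical distance between $U$ and $D$ is at least $\Pr_U(\sum x_i > n/2) - \Pr_D(\sum x_i > n/2)$, which is $\Omega(\epsilon\sqrt{n})$; see e.g. matus's answer here: cstheory.stackexchange.com/questions/14471/… –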
Yury

2
Thanks. Perhaps explain how to get this from what matus wrote, in an answer that I can accept?
Manu


1
Regarding Matus's answer, you can do better than Slud's inequality. See arxiv.org/abs/1606.08920.
Aryeh,

Answers:


7

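Denote the random bits by $x_1, \dots, x_n$. By definition, the statistical distance between $U$ and $D$ is at least $\Pr_U(\sum x_i \ge t) - \Pr_D(\sum x_i \ge t)$ for every $t$. We choose $t = n/2 + \sqrt{n}$.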

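Note that $\Pr_U(\sum x_i \ge t) \ge c_1$ for some absolute constant $c_1 > 0$. If $\Pr_D(\sum x_i \ge t) \le c_1/2$, then the statistical distance is at least $c_1/2$, which is $\Omega(\varepsilon\sqrt{n})$ because $\varepsilon\sqrt{n} \le 1$, and we are done. So we assume below that $\Pr_D(\sum x_i \ge t) \ge c_1/2$.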

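Let $f(s) = \Pr(\sum x_i \ge t)$ for i.i.d. Bernoulli random variables $x_1, \dots, x_n$ with $\Pr(x_i = 1) = 1/2 - s$. Our goal is to prove that $f(0) - f(\varepsilon) = \Omega(\varepsilon\sqrt{n})$. By the mean value theorem,

$$f(0) - f(\varepsilon) = -\varepsilon f'(\xi),$$
for some $\xi \in (0, \varepsilon)$. Now, we will prove that $-f'(\xi) \ge \Omega(\sqrt{n})$; that will imply that the desired statistical distance is at least $\Omega(\sqrt{n}\,\varepsilon)$, as required.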

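Write

$$f(\xi) = \sum_{k \ge t} \binom{n}{k} \left(\tfrac12 - \xi\right)^{k} \left(\tfrac12 + \xi\right)^{n-k},$$
and
$$-f'(\xi) = -\sum_{k \ge t} \binom{n}{k} \left( -k \left(\tfrac12 - \xi\right)^{k-1} \left(\tfrac12 + \xi\right)^{n-k} + (n-k) \left(\tfrac12 - \xi\right)^{k} \left(\tfrac12 + \xi\right)^{n-k-1} \right) = \sum_{k \ge t} \binom{n}{k} \left(\tfrac12 - \xi\right)^{k} \left(\tfrac12 + \xi\right)^{n-k} \cdot \frac{k/2 + k\xi - (n-k)/2 + (n-k)\xi}{(1/2 - \xi)(1/2 + \xi)}.$$
Note that, for every $k \ge t$,
$$\frac{k/2 + k\xi - (n-k)/2 + (n-k)\xi}{(1/2 - \xi)(1/2 + \xi)} = \frac{(2k - n)/2 + n\xi}{(1/2 - \xi)(1/2 + \xi)} \ge 2(2t - n) = 4\sqrt{n}.$$
Thus,
$$-f'(\xi) \ge 4\sqrt{n} \sum_{k \ge t} \binom{n}{k} \left(\tfrac12 - \xi\right)^{k} \left(\tfrac12 + \xi\right)^{n-k} = 4\sqrt{n}\, f(\xi) \ge 4\sqrt{n}\, f(\varepsilon) \ge 4\sqrt{n}\,(c_1/2).$$
Here, we used the assumption that $f(\varepsilon) = \Pr_D(x_1 + \dots + x_n \ge t) \ge c_1/2$. We showed that $-f'(\xi) = \Omega(\sqrt{n})$.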
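
As a rough numerical sanity check of the threshold argument (not part of the proof), the gap $\Pr_U(\sum x_i \ge t) - \Pr_D(\sum x_i \ge t)$ with $t = n/2 + \sqrt{n}$ can be computed exactly from binomial tails. The helper names `binom_pmf`, `tail_prob` and `tail_gap`, and the chosen values of `n` and `gamma`, are illustrative assumptions of this sketch.

```python
from math import ceil, exp, lgamma, log, sqrt

def binom_pmf(n, k, p):
    """Pr(Bin(n, p) = k), computed in log-space to avoid overflow for large n."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def tail_prob(n, p, t):
    """Pr(x_1 + ... + x_n >= t) for i.i.d. Bernoulli(p) bits."""
    return sum(binom_pmf(n, k, p) for k in range(ceil(t), n + 1))

def tail_gap(n, eps):
    """Pr_U(sum >= t) - Pr_D(sum >= t) with the threshold t = n/2 + sqrt(n)."""
    t = n / 2 + sqrt(n)
    return tail_prob(n, 0.5, t) - tail_prob(n, 0.5 - eps, t)

# Assumed test grid: eps = gamma / sqrt(n), so that n <= 1/eps^2 holds.
for n in (100, 400, 1600):
    for gamma in (0.1, 0.5, 1.0):
        eps = gamma / sqrt(n)
        gap = tail_gap(n, eps)
        # The last column should stay bounded away from 0 if gap = Omega(eps * sqrt(n)).
        print(n, gamma, gap, gap / (eps * sqrt(n)))
```

If the claim holds, the printed ratio should not drift toward zero as `n` grows with `gamma` fixed.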

5

A somewhat more elementary, and slightly messier proof (or at least it feels so to me).

For convenience, write $\varepsilon = \frac{\gamma}{\sqrt{n}}$, with $\gamma \in [0, 1)$ by assumption.

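We explicitly lower bound the expression of $d_{\rm TV}(P, U)$:

$$2 d_{\rm TV}(P, U) = \sum_{x \in \{0,1\}^n} \left| \left(\frac12 + \frac{\gamma}{\sqrt{n}}\right)^{|x|} \left(\frac12 - \frac{\gamma}{\sqrt{n}}\right)^{n - |x|} - \frac{1}{2^n} \right| = \frac{1}{2^n} \sum_{k=0}^{n} \binom{n}{k} \left| \left(1 + \frac{2\gamma}{\sqrt{n}}\right)^{k} \left(1 - \frac{2\gamma}{\sqrt{n}}\right)^{n-k} - 1 \right|$$
$$\ge \frac{1}{2^n} \sum_{k = \frac{n}{2} + \sqrt{n}}^{\frac{n}{2} + 2\sqrt{n}} \binom{n}{k} \left| \left(1 + \frac{2\gamma}{\sqrt{n}}\right)^{k} \left(1 - \frac{2\gamma}{\sqrt{n}}\right)^{n-k} - 1 \right| \ge \frac{C}{\sqrt{n}} \sum_{k = \frac{n}{2} + \sqrt{n}}^{\frac{n}{2} + 2\sqrt{n}} \left| \left(1 + \frac{2\gamma}{\sqrt{n}}\right)^{k} \left(1 - \frac{2\gamma}{\sqrt{n}}\right)^{n-k} - 1 \right|$$
where $C > 0$ is an absolute constant. We lower bound each summand separately: fixing $k$, and writing $\ell = k - \frac{n}{2} \in [\sqrt{n}, 2\sqrt{n}]$,
$$\left(1 + \frac{2\gamma}{\sqrt{n}}\right)^{k} \left(1 - \frac{2\gamma}{\sqrt{n}}\right)^{n-k} = \left(1 - \frac{4\gamma^2}{n}\right)^{n/2} \left(\frac{1 + \frac{2\gamma}{\sqrt{n}}}{1 - \frac{2\gamma}{\sqrt{n}}}\right)^{\ell} \ge \left(1 - \frac{4\gamma^2}{n}\right)^{n/2} \left(\frac{1 + \frac{2\gamma}{\sqrt{n}}}{1 - \frac{2\gamma}{\sqrt{n}}}\right)^{\sqrt{n}} \xrightarrow[n \to \infty]{} e^{4\gamma - 2\gamma^2}$$
so that each summand is lower bounded by a quantity that converges (when $n \to \infty$) to $e^{4\gamma - 2\gamma^2} - 1 > 4\gamma - 2\gamma^2 > 2\gamma$; implying that each is $\Omega(\gamma)$. Summing up, this yields
$$2 d_{\rm TV}(P, U) \ge \frac{C}{\sqrt{n}} \sum_{k = \frac{n}{2} + \sqrt{n}}^{\frac{n}{2} + 2\sqrt{n}} \Omega(\gamma) = \Omega(\gamma) = \Omega(\varepsilon\sqrt{n})$$
as claimed.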
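
The bound can also be compared against the exact value numerically. The sketch below relies on the fact that both probability mass functions depend on $x$ only through $|x|$, so $d_{\rm TV}(P, U)$ equals the total variation distance between $\mathrm{Bin}(n, 1/2 + \varepsilon)$ and $\mathrm{Bin}(n, 1/2)$. The function names and the grid of `n` and `gamma` values are assumptions made for illustration.

```python
from math import exp, lgamma, log, sqrt

def binom_pmf(n, k, p):
    """Pr(Bin(n, p) = k), computed in log-space to avoid overflow for large n."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def tv_to_uniform(n, eps):
    """Exact d_TV between n i.i.d. Bernoulli(1/2 + eps) bits and n uniform bits.

    Grouping strings x by their weight k = |x| reduces the 2^n-term sum
    to a sum over k = 0..n of |Bin(n, 1/2 + eps)(k) - Bin(n, 1/2)(k)|.
    """
    p = 0.5 + eps
    return 0.5 * sum(abs(binom_pmf(n, k, p) - binom_pmf(n, k, 0.5))
                     for k in range(n + 1))

# Assumed test grid, with eps = gamma / sqrt(n) and gamma in (0, 1).
for n in (100, 400, 1600):
    for gamma in (0.1, 0.5, 0.9):
        eps = gamma / sqrt(n)
        tv = tv_to_uniform(n, eps)
        # tv / gamma staying bounded below is what d_TV = Omega(gamma) predicts.
        print(n, gamma, tv, tv / gamma)
```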

(Using Hellinger as a proxy because of its nice properties wrt product distributions is tempting, and would be much faster, but there would be a loss by a quadratic factor in the end lower bound.)
Clement C.

1
Nice! I like the elementary approach. We should be able to make it non-asymptotic in $n$ too.... one way is to use $\left(\frac{1+z}{1-z}\right)^{\sqrt{n}} \ge (1+2z)^{\sqrt{n}}$, then use the nice inequality $1 + w \ge e^{w - w^2/2}$. A bit messier.
usul
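
A small numerical check of the two inequalities suggested in the comment above, under the assumption that they are applied with $z = 2\gamma/\sqrt{n}$ and $w = 2z$ (this substitution and the function name `chain` are choices made for this sketch, not something stated in the thread):

```python
from math import exp, sqrt

def chain(n, gamma):
    """Return the three quantities in the chain
    ((1+z)/(1-z))^sqrt(n) >= (1+2z)^sqrt(n) >= exp(sqrt(n) * (2z - 2z^2)),
    with z = 2*gamma/sqrt(n); requires 0 <= z < 1.
    """
    z = 2 * gamma / sqrt(n)
    r = sqrt(n)
    ratio_pow = ((1 + z) / (1 - z)) ** r
    linear_pow = (1 + 2 * z) ** r
    # From 1 + w >= exp(w - w^2/2) with w = 2z: w - w^2/2 = 2z - 2z^2.
    exp_bound = exp(r * (2 * z - 2 * z * z))
    return ratio_pow, linear_pow, exp_bound

for n in (16, 100, 10_000):
    for gamma in (0.1, 0.5, 0.9):
        a, b, c = chain(n, gamma)
        print(n, gamma, a, b, c, a >= b >= c)
```

With this substitution the last quantity equals $e^{4\gamma - 8\gamma^2/\sqrt{n}}$, which holds for every finite $n$ rather than only in the limit.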
Licensed under cc by-sa 3.0 with attribution required.