Maximum likelihood estimators for a truncated distribution


28

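Consider $N$ independent samples $S$ obtained from a random variable $X$ that is assumed to follow a truncated distribution (e.g. a truncated normal distribution) with known (finite) minimum and maximum values $a$ and $b$ but unknown parameters $\mu$ and $\sigma^2$. If $X$ followed a non-truncated distribution, the maximum likelihood estimators $\hat\mu$ and $\hat\sigma^2$ for $\mu$ and $\sigma^2$ from $S$ would be the sample mean $\hat\mu = \frac{1}{N}\sum_i S_i$ and the sample variance $\hat\sigma^2 = \frac{1}{N}\sum_i (S_i - \hat\mu)^2$. However, for a truncated distribution, the sample variance defined in this way is bounded by $(b-a)^2$, so it is not always a consistent estimator: for $\sigma^2 > (b-a)^2$, it cannot converge to $\sigma^2$ as $N$ goes to infinity. So it seems that $\hat\mu$ and $\hat\sigma^2$ are not the maximum likelihood estimators of $\mu$ and $\sigma^2$ for a truncated distribution. Of course, this is to be expected, since the $\mu$ and $\sigma^2$ parameters of a truncated normal distribution are not its mean and variance.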

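So, what are the maximum likelihood estimators of the $\mu$ and $\sigma$ parameters of a truncated distribution with known minimum and maximum values?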
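The bound on the sample variance can be illustrated with a short simulation. This is only a sketch using Python's standard library; the parameter values, the seed, and the helper names `sample_truncated_normal` and `sample_variance` are assumptions chosen for illustration, not part of the question.

```python
import random
from statistics import NormalDist, fmean

def sample_truncated_normal(mu, sigma, a, b, n, rng):
    """Draw n values from N(mu, sigma^2) truncated to [a, b] by inverse-CDF sampling."""
    dist = NormalDist(mu, sigma)
    lo, hi = dist.cdf(a), dist.cdf(b)
    # Map uniform draws into [F(a), F(b)), invert the CDF, and clamp rounding error.
    return [min(max(dist.inv_cdf(lo + (hi - lo) * rng.random()), a), b)
            for _ in range(n)]

def sample_variance(xs):
    """Sample variance with the 1/N convention used in the question."""
    m = fmean(xs)
    return fmean((x - m) ** 2 for x in xs)

rng = random.Random(0)            # illustrative seed
a, b = -1.0, 1.0
mu, sigma = 0.0, 5.0              # sigma^2 = 25 far exceeds (b - a)^2 = 4
xs = sample_truncated_normal(mu, sigma, a, b, 10_000, rng)
var_hat = sample_variance(xs)

print("sample variance:", var_hat)
print("(b - a)^2      :", (b - a) ** 2)
print("sigma^2        :", sigma ** 2)
```

Because every observation lies in $[a,b]$, the sample variance cannot exceed $(b-a)^2$ (in fact it cannot exceed $(b-a)^2/4$), so it stays far below $\sigma^2$ here no matter how large $N$ becomes.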


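Are you sure about your analysis? I think you are making an invalid assumption: in the truncated case, the MLE of $\sigma^2$ is no longer the sample variance (and, in general, the MLE of $\mu$ is no longer the sample mean)!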
whuber

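whuber: I know, this is precisely my question: what are the MLEs of $\sigma^2$ and $\mu$ in the truncated case? I am adding emphasis to insist on this.
a3nm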

1
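There is no closed-form solution. All you can do is maximize the log likelihood numerically (equivalently, minimize its negative). But this is not qualitatively different from many other models, such as logistic regression, which also have no closed-form solution.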
whuber

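whuber: If this is true, it is quite disappointing. Do you have a reference for the lack of a closed-form solution? Is there a closed-form estimator that is not maximum likelihood but is at least consistent (and optionally unbiased)?
a3nm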

1
@whuber: Could you at least reduce the samples to sufficient statistics so that the optimization is fast?
Neil G

Answers:


29

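Consider any location-scale family determined by a "standard" distribution $F$,

$$\Omega_F = \left\{F_{(\mu,\sigma)}: x \to F\left(\frac{x-\mu}{\sigma}\right) \;\middle|\; \sigma > 0\right\}.$$

Assuming $F$ is differentiable, we readily find that the PDFs are $\frac{1}{\sigma} f\left((x-\mu)/\sigma\right) dx$.

Truncating these distributions to restrict their support to lie between $a$ and $b$, $a < b$, means that the PDFs are replaced by

$$f_{(\mu,\sigma;a,b)}(x) = \frac{f\left(\frac{x-\mu}{\sigma}\right) dx}{\sigma\, C(\mu,\sigma,a,b)}, \quad a \le x \le b$$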

(and are zero for all other values of $x$), where $C(\mu,\sigma,a,b) = F_{(\mu,\sigma)}(b) - F_{(\mu,\sigma)}(a)$ is the normalizing factor needed to ensure that $f_{(\mu,\sigma;a,b)}$ integrates to unity. (Note that $C$ is identically $1$ in the absence of truncation.) The log likelihood for iid data $x_i$ therefore is

$$\Lambda(\mu,\sigma) = \sum_i \left[\log f\left(\frac{x_i-\mu}{\sigma}\right) - \log\sigma - \log C(\mu,\sigma,a,b)\right].$$
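As a rough sketch of these definitions, the following takes $F$ to be the standard normal (an assumption made here for concreteness) and checks numerically that dividing by $C$ makes the truncated PDF integrate to one. The function names and the parameter values are illustrative only.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: F is STD.cdf, f is STD.pdf

def normalizer(mu, sigma, a, b):
    """C(mu, sigma, a, b) = F((b - mu)/sigma) - F((a - mu)/sigma)."""
    return STD.cdf((b - mu) / sigma) - STD.cdf((a - mu) / sigma)

def truncated_pdf(x, mu, sigma, a, b):
    """Density of the truncated distribution; zero outside [a, b]."""
    if x < a or x > b:
        return 0.0
    return STD.pdf((x - mu) / sigma) / (sigma * normalizer(mu, sigma, a, b))

def log_likelihood(xs, mu, sigma, a, b):
    """Sum over the data of log f((x - mu)/sigma) - log sigma - log C."""
    log_c = math.log(normalizer(mu, sigma, a, b))
    return sum(math.log(STD.pdf((x - mu) / sigma)) - math.log(sigma) - log_c
               for x in xs)

def integrate(g, lo, hi, steps=20_000):
    """Midpoint rule, accurate enough for a smooth integrand on a short interval."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

# Illustrative parameter values (not from the original post).
mu, sigma, a, b = 0.3, 2.0, -1.0, 1.5
total = integrate(lambda x: truncated_pdf(x, mu, sigma, a, b), a, b)
print("integral of truncated pdf:", total)
print("log likelihood of [0.0, 1.0]:", log_likelihood([0.0, 1.0], mu, sigma, a, b))
```

Taking `math.log` of the density can underflow for points many standard deviations from $\mu$; a production version would work with the log density directly.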

Critical points (including any global maxima) are found where either $\sigma = 0$ (a special case I will ignore here) or the gradient vanishes. Using subscripts to denote derivatives, we may formally compute the gradient and write the likelihood equations as

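$$\begin{aligned}
0 &= \frac{\partial\Lambda}{\partial\mu} = \sum_i \left[\frac{-f_\mu\left(\frac{x_i-\mu}{\sigma}\right)}{f\left(\frac{x_i-\mu}{\sigma}\right)} - \frac{C_\mu(\mu,\sigma,a,b)}{C(\mu,\sigma,a,b)}\right] \\
0 &= \frac{\partial\Lambda}{\partial\sigma} = \sum_i \left[\frac{-f_\sigma\left(\frac{x_i-\mu}{\sigma}\right)}{\sigma^2 f\left(\frac{x_i-\mu}{\sigma}\right)} - \frac{1}{\sigma} - \frac{C_\sigma(\mu,\sigma,a,b)}{C(\mu,\sigma,a,b)}\right]
\end{aligned}$$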

Because $a$ and $b$ are fixed, drop them from the notation and write $nC_\mu(\mu,\sigma,a,b)/C(\mu,\sigma,a,b)$ as $A(\mu,\sigma)$ and $nC_\sigma(\mu,\sigma,a,b)/C(\mu,\sigma,a,b)$ as $B(\mu,\sigma)$. (With no truncation, both functions would be identically zero.) Separating the terms involving the data from the rest gives

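$$\begin{aligned}
-A(\mu,\sigma) &= \sum_i \frac{f_\mu\left(\frac{x_i-\mu}{\sigma}\right)}{f\left(\frac{x_i-\mu}{\sigma}\right)} \\
-\sigma^2 B(\mu,\sigma) - n\sigma &= \sum_i \frac{f_\sigma\left(\frac{x_i-\mu}{\sigma}\right)}{f\left(\frac{x_i-\mu}{\sigma}\right)}
\end{aligned}$$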

By comparing these to the no-truncation situation it is evident that

  • Any sufficient statistics for the original problem are sufficient for the truncated problem (because the right hand sides have not changed).

  • Our ability to find closed-form solutions relies on the tractability of A and B. If these do not involve μ and σ in simple ways, we cannot hope to obtain closed-form solutions in general.
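As a hedged illustration, here is how the likelihood equations work out when the family is normal. This specialization is not spelled out in the answer; $\phi$ and $\Phi$ denote the standard normal PDF and CDF, and $\alpha = (a-\mu)/\sigma$ and $\beta = (b-\mu)/\sigma$ are shorthand introduced here. The derivatives are written out explicitly rather than in the subscript notation.

$$\begin{aligned}
C(\mu,\sigma) &= \Phi(\beta) - \Phi(\alpha), \\
C_\mu &= -\frac{\phi(\beta) - \phi(\alpha)}{\sigma}, \qquad
C_\sigma = -\frac{\beta\,\phi(\beta) - \alpha\,\phi(\alpha)}{\sigma}, \\
0 &= \frac{\partial\Lambda}{\partial\mu} = \sum_i \frac{x_i - \mu}{\sigma^2} - n\,\frac{C_\mu}{C}, \\
0 &= \frac{\partial\Lambda}{\partial\sigma} = \sum_i \frac{(x_i - \mu)^2}{\sigma^3} - \frac{n}{\sigma} - n\,\frac{C_\sigma}{C}.
\end{aligned}$$

The data enter only through $\sum_i x_i$ and $\sum_i x_i^2$, while the truncation terms involve $\phi$ and $\Phi$ evaluated at $\alpha$ and $\beta$. With no truncation, $C = 1$ and $C_\mu = C_\sigma = 0$, and the equations reduce to the sample mean and the sample variance.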

For the case of a normal family, $C(\mu,\sigma,a,b)$ of course is given by the normal CDF, which makes it a difference of error functions: there is no chance that a closed-form solution can be obtained in general. However, there are only two sufficient statistics (the sample mean and variance will do) and the CDF is as smooth as can be, so numerical solutions will be relatively easy to obtain.
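A numerical solution for the truncated normal might look like the sketch below. It rests on several assumptions that are not in the answer: the data are simulated, the optimizer is a simple compass (pattern) search over $(\mu, \log\sigma)$ started at the naive estimates, and all function names are hypothetical. The objective touches the data only through the sample mean `m` and the $1/N$ sample variance `v`, using $\sum_i (x_i-\mu)^2 = n\,[v + (m-\mu)^2]$.

```python
import math
import random
from statistics import NormalDist, fmean

STD = NormalDist()

def mean_loglik(mu, sigma, m, v, a, b):
    """Per-observation log likelihood, computed from the sufficient statistics m and v."""
    c = STD.cdf((b - mu) / sigma) - STD.cdf((a - mu) / sigma)
    if c <= 0.0:                      # normalizer underflowed: treat as impossible
        return -math.inf
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - (v + (m - mu) ** 2) / (2 * sigma ** 2) - math.log(c))

def fit_truncated_normal(xs, a, b, tol=1e-7, max_iter=10_000):
    """Compass search over (mu, log sigma), starting from the naive estimates."""
    m = fmean(xs)
    v = fmean((x - m) ** 2 for x in xs)
    sd = math.sqrt(v)
    mu, log_s = m, math.log(sd)
    best = mean_loglik(mu, sd, m, v, a, b)
    step = 0.5
    for _ in range(max_iter):
        if step < tol:
            break
        moved = False
        # Try one step in each coordinate direction; accept the first improvement.
        for d_mu, d_ls in ((step * sd, 0.0), (-step * sd, 0.0), (0.0, step), (0.0, -step)):
            cand = mean_loglik(mu + d_mu, math.exp(log_s + d_ls), m, v, a, b)
            if cand > best:
                mu, log_s, best, moved = mu + d_mu, log_s + d_ls, cand, True
                break
        if not moved:
            step /= 2                 # no direction improves: refine the mesh
    return mu, math.exp(log_s)

def sample_truncated_normal(mu, sigma, a, b, n, rng):
    """Inverse-CDF sampling from N(mu, sigma^2) truncated to [a, b]."""
    dist = NormalDist(mu, sigma)
    lo, hi = dist.cdf(a), dist.cdf(b)
    return [min(max(dist.inv_cdf(lo + (hi - lo) * rng.random()), a), b)
            for _ in range(n)]

# Simulated data with illustrative true parameters mu = 1.5, sigma = 1 on [-1, 3].
rng = random.Random(1)
a, b = -1.0, 3.0
xs = sample_truncated_normal(1.5, 1.0, a, b, 5_000, rng)
mu_hat, sigma_hat = fit_truncated_normal(xs, a, b)
print("naive estimates:", fmean(xs), math.sqrt(fmean((x - fmean(xs)) ** 2 for x in xs)))
print("truncated MLE  :", mu_hat, sigma_hat)
```

A general-purpose optimizer or Newton's method on the likelihood equations would serve equally well; the compass search is used here only to keep the sketch dependency-free. When the sample variance is close to that of a uniform distribution on $[a,b]$, the likelihood can be nearly flat in $\sigma$ and the search may wander toward very large values.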


Thanks a lot for this very detailed answer! I'm not sure I get what $f_\mu$, $f_\sigma$, $C_\mu$, and $C_\sigma$ are, could you define them? Also, it's obvious but to be precise maybe you could say that your expression for the pdf is for $x \in [a,b]$ (and the pdf is zero outside of that). Thanks again!
a3nm

1
The usual longer notation is $C_\mu = \frac{\partial}{\partial\mu} C(\mu,\sigma,a,b)$, etc: as announced, it is a derivative. I will make the second change you suggest because it's an important clarification, thanks.
whuber

Also, since your answer is more general than the one I expected, I edited my question to insist less on the case of normal distributions. Thanks again for your effort.
a3nm

1
It was easier to explain at this level of generality compared to focusing on the Normal distributions! Computing the derivatives and showing the precise form of the CDF are unnecessary distractions (although useful when you start actually coding the numerical solution).
whuber

1
Thanks for fixing! You missed one of them; could you review my edit?
a3nm
Licensed under cc by-sa 3.0 with attribution required.