EM maximum likelihood estimation for the Weibull distribution



Note: I am posting this question on behalf of a former student of mine who, for technical reasons, cannot post it himself.

Given an iid sample $x_1, \dots, x_n$ from a Weibull distribution with pdf
$$f_k(x) = k\,x^{k-1}e^{-x^k}, \qquad x > 0,$$
is there a useful missing-variable representation
$$f_k(x) = \int_{\mathcal{Z}} g_k(x, z)\,dz,$$
so that the associated EM (expectation-maximization) algorithm could be used to find the MLE of $k$, rather than using straightforward numerical optimisation?

Is there censoring?
ocram

What's wrong with Newton-Raphson?
probabilityislogic

@probabilityislogic: nothing is wrong with it! My student simply wants to know whether an EM version exists, that's all.
Xi'an

Could you give an example of what you are looking for in a different, simpler context, e.g. with observed Gaussian or uniform random variables? When all of the data are observed, I (and some of the other posters, based on their comments) fail to see how EM is relevant to your question.
ahfoss 2014

@probabilityislogic I think you should have said, "oh, you mean use Newton-Raphson?". Weibulls are a regular family... I think, so the ML solution is unique. Hence EM has nothing to "E", so you are just "M"ing... and finding the root of the score equation is the best way to do it!
2014

Answers:



If I understand the question correctly, I think the answer is yes.

Write $z_i = x_i^k$. Then an iteration of the EM-type algorithm, starting from $\hat{k} = 1$, is

  • E step: $\hat{z}_i = x_i^{\hat{k}}$

  • M step: $\hat{k} = \dfrac{n}{\sum_i\left[(\hat{z}_i - 1)\log x_i\right]}$

This is a special case (with no censoring and no covariates) of the iteration suggested by Aitkin and Clayton (1980) for the Weibull proportional hazards model. It can also be found in Section 6.11 of Aitkin et al. (1989). A small code sketch of the iteration is given below, after the references.

  • Aitkin, M. and Clayton, D., 1980. The fitting of exponential, Weibull and extreme value distributions to complex censored survival data using GLIM. Applied Statistics, pp. 156-163.

  • Aitkin, M., Anderson, D., Francis, B. and Hinde, J., 1989. Statistical Modelling in GLIM. Oxford University Press, New York.
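For concreteness, here is a minimal Python sketch of the E/M iteration above (the function name, tolerance, and iteration cap are my own illustrative choices, with no convergence safeguards beyond the cap). Note that the fixed point of the M-step update is exactly the root of the score equation in $k$, so the iteration targets the MLE.

```python
import numpy as np

def weibull_shape_em(x, k_init=1.0, tol=1e-10, max_iter=500):
    """EM-type iteration for the shape k of the pdf k x^(k-1) exp(-x^k)."""
    x = np.asarray(x, dtype=float)
    logx = np.log(x)
    k = k_init
    for _ in range(max_iter):
        z = x ** k                                  # E step: impute z_i = x_i^k
        k_new = len(x) / np.sum((z - 1.0) * logx)   # M step
        if abs(k_new - k) < tol:
            break
        k = k_new
    return k

# Quick check on simulated data with true shape k = 2
rng = np.random.default_rng(0)
print(weibull_shape_em(rng.weibull(2.0, size=10_000)))  # close to 2
```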


Thank you so much, David! Treating $x_i^k$ as the missing variable is something I would never have thought of...!
Xi'an


The Weibull MLE is only available numerically:

Let $X \sim \mathrm{Weibull}(\lambda, \beta)$ with pdf
$$f_{\lambda,\beta}(x) = \begin{cases}\frac{\beta}{\lambda}\left(\frac{x}{\lambda}\right)^{\beta-1}e^{-(x/\lambda)^{\beta}}, & x \geq 0\\ 0, & x < 0\end{cases} \qquad \beta, \lambda > 0

1) Likelihood function:
$$L_{\hat{x}}(\lambda, \beta) = \prod_{i=1}^{N} f_{\lambda,\beta}(x_i) = \prod_{i=1}^{N}\frac{\beta}{\lambda}\left(\frac{x_i}{\lambda}\right)^{\beta-1}e^{-(x_i/\lambda)^{\beta}} = \frac{\beta^{N}}{\lambda^{N\beta}}\,e^{-\sum_{i=1}^{N}(x_i/\lambda)^{\beta}}\prod_{i=1}^{N} x_i^{\beta-1}$$

Log-likelihood function:
$$\ell_{\hat{x}}(\lambda, \beta) := \ln L_{\hat{x}}(\lambda, \beta) = N\ln\beta - N\beta\ln\lambda - \sum_{i=1}^{N}\left(\frac{x_i}{\lambda}\right)^{\beta} + (\beta - 1)\sum_{i=1}^{N}\ln x_i$$

2) MLE problem:
$$\max_{(\lambda, \beta)} \ \ell_{\hat{x}}(\lambda, \beta) \quad \text{s.t. } \lambda > 0,\ \beta > 0$$

3) Maximization: setting the gradients to $0$, it follows that
$$\frac{\partial \ell}{\partial \lambda} = -N\beta\frac{1}{\lambda} + \beta\sum_{i=1}^{N} x_i^{\beta}\frac{1}{\lambda^{\beta+1}} \overset{!}{=} 0$$
$$\frac{\partial \ell}{\partial \beta} = \frac{N}{\beta} - N\ln\lambda - \sum_{i=1}^{N}\ln\left(\frac{x_i}{\lambda}\right)e^{\beta\ln(x_i/\lambda)} + \sum_{i=1}^{N}\ln x_i \overset{!}{=} 0$$

Solving the first equation for $\lambda$:
$$-N\beta\frac{1}{\lambda} + \beta\sum_{i=1}^{N} x_i^{\beta}\frac{1}{\lambda^{\beta+1}} = 0 \;\Leftrightarrow\; -1 + \frac{1}{N}\sum_{i=1}^{N} x_i^{\beta}\frac{1}{\lambda^{\beta}} = 0 \;\Leftrightarrow\; \frac{1}{N}\sum_{i=1}^{N} x_i^{\beta} = \lambda^{\beta}$$
$$\lambda^{*} = \left(\frac{1}{N}\sum_{i=1}^{N} x_i^{\beta}\right)^{1/\beta}$$

Substituting $\lambda^{*}$ into the second equation yields an equation in $\beta$ alone:
$$\beta^{*} = \left[\frac{\sum_{i=1}^{N} x_i^{\beta^{*}}\ln x_i}{\sum_{i=1}^{N} x_i^{\beta^{*}}} - \overline{\ln x}\right]^{-1}, \qquad \overline{\ln x} = \frac{1}{N}\sum_{i=1}^{N}\ln x_i$$

This equation can only be solved numerically, e.g. with the Newton-Raphson algorithm. $\hat{\beta}$ can then be plugged into $\lambda^{*}$ to complete the ML estimator for the Weibull distribution.
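For illustration, here is a minimal Python sketch of this two-step recipe, assuming scipy's brentq root finder and a hand-picked bracket for $\beta$ (both of which are my own choices, not part of the answer):

```python
import numpy as np
from scipy.optimize import brentq

def weibull_mle(x):
    """Two-parameter Weibull MLE via the profile equation for beta."""
    x = np.asarray(x, dtype=float)
    logx = np.log(x)

    def score(b):
        # beta solves: sum(x^b ln x)/sum(x^b) - mean(ln x) - 1/b = 0;
        # rescaling x by its maximum cancels in the ratio and avoids overflow
        xb = (x / x.max()) ** b
        return np.sum(xb * logx) / np.sum(xb) - logx.mean() - 1.0 / b

    beta = brentq(score, 0.01, 100.0)         # bracket assumed to contain the root
    lam = np.mean(x ** beta) ** (1.0 / beta)  # closed-form lambda* given beta
    return lam, beta

rng = np.random.default_rng(1)
data = 2.0 * rng.weibull(1.5, size=5_000)     # scale 2, shape 1.5
print(weibull_mle(data))                      # approximately (2.0, 1.5)
```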


Unfortunately, this does not appear to answer the question in any discernible way. The OP is very clearly aware of Newton-Raphson and related approaches. The feasibility of N-R in no way precludes the existence of a missing-variable representation or associated EM algorithm. In my estimation, the question is not concerned at all with numerical solutions, but rather is probing for insight that might become apparent if an interesting missing-variable approach were demonstrated.
cardinal

@cardinal It is one thing to say there is only a numerical solution, and another thing to show that there is only a numerical solution.
emcor

Dear @emcor, I think you may be misunderstanding what the question is asking. Perhaps reviewing the other answer and associated comment stream would be helpful. Cheers.
cardinal

@cardinal I agree it is not a direct answer, but these are the exact expressions for the MLEs, which can e.g. be used to verify the EM.
emcor


Though this is an old question, it looks like there is an answer in a paper published here: http://home.iitk.ac.in/~kundu/interval-censoring-REVISED-2.pdf

In this work, the analysis of interval-censored data with the Weibull distribution as the underlying lifetime distribution is considered. It is assumed that the censoring mechanism is independent and non-informative. As expected, the maximum likelihood estimators cannot be obtained in closed form. In our simulation experiments it is observed that the Newton-Raphson method may fail to converge many times. An expectation-maximization algorithm has been suggested to compute the maximum likelihood estimators, and it converges almost all the time.


Can you post a full citation for the paper at the link, in case it goes dead?
gung - Reinstate Monica

This is an EM algorithm, but does not do what I believe the OP wants. Rather, the E-step imputes the censored data, after which the M-step uses a fixed point algorithm with the complete data set. So the M-step is not in closed form (which I think is what the OP is looking for).
Cliff AB

@CliffAB: thank you for the link (+1) but indeed the EM is naturally induced in this paper by the censoring part. My former student was looking for a plain uncensored iid Weibull likelihood optimisation via EM.
Xi'an


In this case the MLE and EM estimators are equivalent, since the MLE estimator is actually just a special case of the EM estimator. (I am assuming a frequentist framework in my answer; this isn't true for EM in a Bayesian context in which we're talking about MAP's). Since there is no missing data (just an unknown parameter), the E step simply returns the log likelihood, regardless of your choice of k(t). The M step then maximizes the log likelihood, yielding the MLE.

EM would be applicable, for example, if you had observed data from a mixture of two Weibull distributions with parameters k1 and k2, but you didn't know which of these two distributions each observation came from.
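As a rough sketch of what such a mixture EM could look like (the function names, the decision to fix both scales at 1 and estimate only the shapes, and the numerical M step are all illustrative assumptions on my part):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def log_weibull_pdf(x, k):
    # log of the standard Weibull density k x^(k-1) exp(-x^k), scale fixed at 1
    return np.log(k) + (k - 1.0) * np.log(x) - x ** k

def weibull_mixture_em(x, k1=0.5, k2=3.0, w=0.5, n_iter=100):
    """EM for a two-component Weibull mixture with unobserved labels."""
    x = np.asarray(x, dtype=float)
    for _ in range(n_iter):
        # E step: responsibility of component 1 for each observation
        a = np.log(w) + log_weibull_pdf(x, k1)
        b = np.log(1.0 - w) + log_weibull_pdf(x, k2)
        r = 1.0 / (1.0 + np.exp(b - a))
        # M step: weighted (numerical) MLE of each shape, closed form for w
        k1 = minimize_scalar(lambda k: -np.sum(r * log_weibull_pdf(x, k)),
                             bounds=(0.01, 20.0), method="bounded").x
        k2 = minimize_scalar(lambda k: -np.sum((1.0 - r) * log_weibull_pdf(x, k)),
                             bounds=(0.01, 20.0), method="bounded").x
        w = r.mean()
    return k1, k2, w

rng = np.random.default_rng(3)
data = np.concatenate([rng.weibull(0.7, 300), rng.weibull(4.0, 700)])
print(weibull_mixture_em(data))  # roughly (0.7, 4.0, 0.3)
```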


I think you may have misinterpreted the point of the question, which is: Does there exist some missing-variable interpretation from which one would obtain the given Weibull likelihood (and which would allow an EM-like algorithm to be applied)?
cardinal

The question statement in @Xi'an's post is quite clear. I think the reason it hasn't been answered is because any answer is likely nontrivial. (It's interesting, so I wish I had more time to think about it.) At any rate, your comment appears to betray a misunderstanding of the EM algorithm. Perhaps the following will serve as an antidote:
cardinal

Let $f(x) = \pi\varphi(x - \mu_1) + (1 - \pi)\varphi(x - \mu_2)$, where $\varphi$ is the standard normal density function. Let $F(x) = \int_{-\infty}^{x} f(u)\,du$. With $U_1, \dots, U_n$ iid standard uniform, take $X_i = F^{-1}(U_i)$. Then, $X_1, \dots, X_n$ is a sample from a Gaussian mixture model. We can estimate the parameters by (brute-force) maximum likelihood. Is there any missing data in our data-generation process? No. Does it have a latent-variable representation allowing for the use of an EM algorithm? Yes, absolutely.
cardinal
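A small Python sketch of cardinal's point, with illustrative parameter values and a numerical inverse CDF of my own choosing: the inverse-CDF construction involves no missing data, yet the same distribution also arises from a latent-label construction, and the latter is what EM augments.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
pi_, mu1, mu2, n = 0.3, -2.0, 2.0, 100_000

# Construction 1 (cardinal's): X_i = F^{-1}(U_i); nothing is missing here.
grid = np.linspace(-10.0, 10.0, 20_001)
cdf = pi_ * norm.cdf(grid - mu1) + (1.0 - pi_) * norm.cdf(grid - mu2)
x1 = np.interp(rng.uniform(size=n), cdf, grid)  # numerical inverse of F

# Construction 2: latent label Z_i ~ Bernoulli(pi), then X_i | Z_i ~ N(mu_{Z_i}, 1).
z = rng.uniform(size=n) < pi_
x2 = np.where(z, mu1, mu2) + rng.standard_normal(n)

# Both produce the same distribution, so the latent-variable EM applies to either.
print(np.quantile(x1, [0.25, 0.5, 0.75]))
print(np.quantile(x2, [0.25, 0.5, 0.75]))
```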

My apologies @cardinal; I think I have misunderstood two things about your latest post. Yes, in the GMM problem you could search $\mathbb{R}^2 \times [0,1]$ via a brute-force ML approach. Also, I now see that the original problem looks for a solution that involves introducing a latent variable that allows for an EM approach to estimating the parameter $k$ in the given density $k x^{k-1} e^{-x^k}$. An interesting problem. Are there any examples of using EM like this in such a simple context? Most of my exposure to EM has been in the context of mixture problems and data imputation.
ahfoss

@ahfoss: (+1) to your latest comment. Yes! You got it. As for examples: (i) it shows up in censored data problems, (ii) classical applications like hidden Markov models, (iii) simple threshold models like probit models (e.g., imagine observing the latent $Z_i$ instead of Bernoulli $X_i = \mathbf{1}(Z_i > \mu)$), (iv) estimating variance components in one-way random effects models (and much more complex mixed models), and (v) finding the posterior mode in a Bayesian hierarchical model. The simplest is probably (i) followed by (iii).
cardinal
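To illustrate example (iii) in a context as simple as the one in the question, here is a hedged Python sketch (the model variant, names, and settings are my own: I put the threshold at zero, $X_i = \mathbf{1}(Z_i > 0)$ with $Z_i \sim N(\mu, 1)$, so that $\mu$ is identifiable):

```python
import numpy as np
from scipy.stats import norm

def probit_threshold_em(x, mu=0.0, n_iter=200):
    """EM for mu in Z_i ~ N(mu, 1) observed only through X_i = 1(Z_i > 0).

    E step: impute Z_i by its truncated-normal conditional mean given X_i.
    M step: mu = mean of the imputed Z_i (the complete-data MLE).
    """
    x = np.asarray(x, dtype=bool)
    for _ in range(n_iter):
        z_pos = mu + norm.pdf(mu) / norm.cdf(mu)    # E[Z | Z > 0]
        z_neg = mu - norm.pdf(mu) / norm.cdf(-mu)   # E[Z | Z <= 0]
        mu = np.where(x, z_pos, z_neg).mean()
    return mu

rng = np.random.default_rng(4)
obs = (rng.standard_normal(50_000) + 1.0) > 0       # true mu = 1
print(probit_threshold_em(obs))                     # close to 1.0
print(norm.ppf(obs.mean()))                         # direct MLE Phi^{-1}(mean(x))
```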