When can logistic regression be solved in closed form?


31

Take $x \in \{0,1\}^d$ and $y \in \{0,1\}$, and assume we model the task of predicting $y$ given $x$ using logistic regression. When can the logistic regression coefficients be written in closed form?

One example is when we use a saturated model.

That is, define $P(y|x) \propto \exp\left(\sum_i w_i f_i(x)\right)$, where $i$ ranges over index sets in the power set of $\{x_1, \ldots, x_d\}$, and $f_i$ returns 1 if all the variables in the $i$-th set are 1, and 0 otherwise. Then you can express each $w_i$ in this logistic regression model as the logarithm of a rational function of statistics of the data.
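For concreteness, here is a minimal numerical sketch of this claim (the $d = 2$ setup, simulated data, and all variable names are mine, not part of the question): the saturated MLE reproduces the empirical conditional probabilities exactly, so each weight falls out by Möbius inversion of the empirical log-odds, i.e., as the log of a rational function of cell counts.

```python
import numpy as np

# Saturated logistic model on x in {0,1}^2:
#   logit P(y=1|x) = w0 + w1*x1 + w2*x2 + w12*x1*x2
# The saturated MLE matches each cell's empirical P(y=1|x) exactly,
# so the weights are differences of empirical log-odds (logs of
# rational functions of the counts).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(5000, 2))
logits = -0.5 + 1.0 * X[:, 0] + 0.7 * X[:, 1] + 0.3 * X[:, 0] * X[:, 1]
y = rng.random(5000) < 1 / (1 + np.exp(-logits))

def emp_logit(x1, x2):
    cell = (X[:, 0] == x1) & (X[:, 1] == x2)
    p = y[cell].mean()                 # #(y=1, x) / #(x): a ratio of counts
    return np.log(p / (1 - p))

w0 = emp_logit(0, 0)
w1 = emp_logit(1, 0) - w0
w2 = emp_logit(0, 1) - w0
w12 = emp_logit(1, 1) - w1 - w2 - w0   # Mobius inversion
print(w0, w1, w2, w12)                 # approaches -0.5, 1.0, 0.7, 0.3
```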

Are there other interesting examples where a closed form exists?


4
I assume you mean "when are the MLEs of the parameters in closed form?"
Glen_b -Reinstate Monica

Can you give more detail on what you did? Your question reads as if you tried to derive the ordinary least squares estimator for a logistic regression problem.
Momo

1
Thanks for the interesting post/question, Yaroslav. Do you have a reference for the example that you show?
Bitwise

1
It's been a while, but possibly it was in Lauritzen's "Graphical Models" book. The broader foundations of the answer to this question are there -- you get a closed-form solution when the (hyper)graph formed by the sufficient statistics is chordal.
Yaroslav Bulatov

This might be interesting: tandfonline.com/doi/abs/10.1080/… I believe this is a special case of an analytical solution when you only have a 2x2 table.
Austin

Answers:


33

As kjetil b halvorsen pointed out, it is, in its own way, a miracle that linear regression admits an analytical solution. And this is so only by virtue of the linearity of the problem (with respect to the parameters). In OLS, you have
$$\sum_i (y_i - x_i'\beta)^2 \to \min_\beta,$$
which has the first-order conditions
$$-2\sum_i (y_i - x_i'\beta) x_i = 0.$$
For a problem with $p$ variables (including the constant, if needed; there are some regression-through-the-origin problems, too), this is a system with $p$ equations and $p$ unknowns. Most importantly, it is a linear system, so you can find a solution using standard linear algebra theory and practice. This system will have a solution with probability 1 unless you have perfectly collinear variables.
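A minimal sketch of that one-step solve, on simulated data of my own (nothing here comes from the answer itself): the normal equations $X'X\beta = X'y$ are handed to a linear solver, and the estimate comes back with no iteration at all.

```python
import numpy as np

# Closed-form OLS via the normal equations X'X beta = X'y.
# Assumes X has full column rank (no perfect collinearity).
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 vars
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # one linear solve, done
print(beta_hat)                               # close to beta_true
```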

Now, with logistic regression, things aren't that easy anymore. Writing down the log-likelihood function,
$$l(y; x, \beta) = \sum_i y_i \ln p_i + (1 - y_i) \ln(1 - p_i), \quad p_i = (1 + \exp(-\theta_i))^{-1}, \quad \theta_i = x_i'\beta,$$
and taking its derivative to find the MLE, we get
$$\frac{\partial l}{\partial \beta} = \sum_i \frac{\mathrm{d}p_i}{\mathrm{d}\theta}\left(\frac{y_i}{p_i} - \frac{1 - y_i}{1 - p_i}\right) x_i = \sum_i \left[y_i - \frac{1}{1 + \exp(-x_i'\beta)}\right] x_i$$
The parameters $\beta$ enter this in a very nonlinear way: for each $i$, there's a nonlinear function, and they are added together. There is no analytical solution (except probably in a trivial situation with two observations, or something like that), and you have to use nonlinear optimization methods to find the estimates $\hat\beta$.
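To illustrate what "nonlinear optimization methods" means in practice, here is a sketch of Newton-Raphson (equivalently, IRLS) applied to the score equations above; the simulated data and names are my own, not taken from the answer.

```python
import numpy as np

# Newton-Raphson / IRLS for the logistic MLE: the score equations are
# nonlinear in beta, so we iterate instead of solving once.
rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([0.3, 1.0, -0.8])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

beta = np.zeros(3)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    score = X.T @ (y - p)                        # the gradient derived above
    hess = -(X * (p * (1 - p))[:, None]).T @ X   # second derivative: -X'WX
    step = np.linalg.solve(-hess, score)         # Newton step
    beta += step
    if np.max(np.abs(step)) < 1e-10:
        break
print(beta)                                      # close to beta_true
```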

A somewhat deeper look into the problem (taking the second derivative) reveals that this is a convex optimization problem of finding the maximum of a concave function (a glorified multivariate parabola), so either one exists, and any reasonable algorithm should find it rather quickly, or things blow off to infinity. The latter does happen to logistic regression when $\mathrm{Prob}[Y_i = 1 \mid x_i'\beta > c] = 1$ for some $c$, i.e., you have perfect prediction. This is a rather unpleasant artifact: you would think that when you have a perfect prediction, the model works perfectly, but curiously enough, it is the other way round.
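A tiny demo of that blow-up, with made-up data (four points, perfectly separated at zero): the log-likelihood keeps climbing toward 0 as the coefficient grows, so no finite maximizer exists and any optimizer will chase it off to infinity.

```python
import numpy as np

# Perfect separation: x < 0 always gives y = 0, x > 0 always gives y = 1.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

def loglik(b):
    t = b * x
    # y*t - log(1 + e^t) is the log-likelihood, written in a numerically
    # stable form via logaddexp.
    return np.sum(y * t - np.logaddexp(0.0, t))

for b in [1.0, 10.0, 100.0]:
    print(b, loglik(b))   # increases toward 0, never attains a maximum
```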


The question is why your last equation is not solvable. Is it due to the logistic function's inverse diverging at 0 and 1, or is this due to the nonlinearity in general?
eyaler

5
(+1) Regarding your last paragraph: From a mathematical perspective it does work "perfectly" in the sense that an MLE will yield a perfect separating hyperplane. Whether your numerical algorithm behaves sensibly in that circumstance is a separate matter. Laplace smoothing is often used in such situations.
cardinal

@eyaler, I would say this is due to nonlinearity in general. My understanding is that there is a limited set of circumstances when this can be solved, although I don't know what these circumstances are.
StasK

1
I don't understand: what mathematical condition is present that makes the system not have a closed-form solution? Is there a general condition under which things don't have closed-form solutions?
Charlie Parker

Is the fact that logistic regression has no closed form something one can prove by looking at the gradient descent iteration for it?
Charlie Parker

8

This post was originally intended as a long comment rather than a complete answer to the question at hand.

From the question, it's a little unclear if the interest lies only in the binary case or, perhaps, in more general cases where the variables may be continuous or take on other discrete values.

One example that doesn't quite answer the question, but is related, and which I like, deals with item-preference rankings obtained via paired comparisons. The Bradley–Terry model can be expressed as a logistic regression where
$$\mathrm{logit}(\Pr(Y_{ij} = 1)) = \alpha_i - \alpha_j,$$
and $\alpha_i$ is an "affinity", "popularity", or "strength" parameter of item $i$, with $Y_{ij} = 1$ indicating item $i$ was preferred over item $j$ in a paired comparison.

If a full round-robin of comparisons is performed (i.e., a pairwise preference is recorded for each unordered $(i, j)$ pair), then it turns out that the rank order of the MLEs $\hat\alpha_i$ corresponds to the rank order of $S_i = \sum_{j \neq i} Y_{ij}$, the sum total of times each object was preferred over another.

To interpret this, imagine a full round-robin tournament in your favorite competitive sport. Then, this result says that the Bradley–Terry model ranks the players/teams according to their winning percentage. Whether this is an encouraging or disappointing result depends on your point of view, I suppose.

NB This rank-ordering result does not hold, in general, when a full round-robin is not played.
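Here is a quick numerical check of the claim on a simulated full round robin (my own setup; the fitting routine is the standard MM iteration of Hunter, 2004, not anything from this answer): the ranking by fitted strengths matches the ranking by raw win totals, up to ties.

```python
import numpy as np

# Simulate one game per unordered pair under a Bradley-Terry model, fit the
# strengths by the MM iteration, and compare rankings.
rng = np.random.default_rng(3)
m = 6
alpha = rng.normal(size=m)                       # true strengths
wins = np.zeros((m, m))                          # wins[i, j] = 1 iff i beat j
for i in range(m):
    for j in range(i + 1, m):
        p_ij = 1 / (1 + np.exp(-(alpha[i] - alpha[j])))
        winner, loser = (i, j) if rng.random() < p_ij else (j, i)
        wins[winner, loser] = 1

gamma = np.ones(m)                               # gamma_i = exp(alpha_i)
for _ in range(1000):
    for i in range(m):
        denom = sum(1.0 / (gamma[i] + gamma[j]) for j in range(m) if j != i)
        gamma[i] = wins[i].sum() / denom         # MM update (Hunter, 2004)
    gamma /= gamma.sum()                         # fix the overall scale

# Up to ties in the win totals, the two orderings agree. (A team winning all
# or none of its games pushes its estimate to the boundary -- the perfect-
# separation issue from the other answer, in miniature.)
print(np.argsort(-gamma))                        # ranking by MLE strength
print(np.argsort(-wins.sum(axis=1)))             # ranking by win totals S_i
```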


2
I was interested in binary because it was the easiest to analyze. I have found a very broad sufficient condition in the works of Lauritzen -- you get a closed form if the corresponding log-linear model is decomposable.
Yaroslav Bulatov