也许没有直接联系?的分布在很大程度上取决于您对的模拟。如果使用模拟,则将具有对数正态分布,给定。然后可以明确找到的分布:使用cdf反cdf和pdf其中与Beta发行版不同。X X Ñ (0 ,1 )EXP (- X β )μ = - 1 β 0 = β 1 = 1个P (甲= 1 | X )˚F (X )= 1 - Φ [ ln (1P(A=1|X)XXN(0,1)exp(−Xβ)μ=−1β0=β1=1P(A=1|X)Q(x)=1
F(x)=1−Φ[ln(1x−1)+1],
Q(x)=11+exp(Φ−1(1−x)−1),
f(x)=1x(1−x)2π−−√exp(−(ln(1/x−1)+1)22),
您可以验证上面R中给出的结果:
n = 100000
X = cbind(rep(1, n), rnorm(n)) # simulate design matrix
Y = 1 / (exp(-X %*% c(1,1)) + 1) # P(A=1|X)
Z1 = 1 / (rlnorm(n, -1, 1) + 1) # simulate from lognormal directly
Z2 = 1 / (1 + exp(qnorm(runif(n)) - 1)) # simulate with inverse CDF
# Kolmogorov–Smirnov test
ks.test(Y, Z1)
ks.test(Y, Z2)
# plot fitted density
new.pdf = function(x) {
1 / (x * (1 - x) * sqrt(2 * pi)) * exp(-0.5 * (log(1 / x - 1) + 1)^2)
}
hist(Y, breaks = "FD", probability = T)
curve(new.pdf, col = 4, add = T)