How do I calculate the inverse of the normal cumulative distribution function in Python?


72

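How do I calculate the inverse of the cumulative distribution function (CDF) of the normal distribution in Python?

Which library should I use? Possibly scipy?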


1
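Do you mean the inverse Gaussian distribution (en.wikipedia.org/wiki/Inverse_Gaussian_distribution), or the inverse of the cumulative distribution function of the normal distribution (en.wikipedia.org/wiki/Normal_distribution), or something else?
Warren Weckesser

@WarrenWeckesser The second one: the inverse of the cumulative distribution function of the normal distribution.
Yueyoum 2013

@WarrenWeckesser I mean a Python version of the "normsinv" function in Excel.
Yueyoum 2013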

Answers:


127

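NORMSINV (mentioned in a comment) is the inverse of the CDF of the standard normal distribution. Using scipy, you can compute this with the ppf method of the scipy.stats.norm object. The acronym ppf stands for percent point function, which is another name for the quantile function.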

In [20]: from scipy.stats import norm

In [21]: norm.ppf(0.95)
Out[21]: 1.6448536269514722

Check that it is the inverse of the CDF:

In [34]: norm.cdf(norm.ppf(0.95))
Out[34]: 0.94999999999999996

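By default, norm.ppf uses mean = 0 and stddev = 1, which is the "standard" normal distribution. You can use a different mean and standard deviation by specifying the loc and scale arguments, respectively.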

In [35]: norm.ppf(0.95, loc=10, scale=2)
Out[35]: 13.289707253902945
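
The loc and scale arguments shift and rescale the standard normal quantile, so the call above should agree with loc + scale * norm.ppf(p). A minimal sketch of that check follows; the variable names are illustrative and not part of the original answer.

```python
import math
from scipy.stats import norm

p, mean, stddev = 0.95, 10, 2  # illustrative values matching the call above

# Quantile computed directly with loc/scale
direct = norm.ppf(p, loc=mean, scale=stddev)

# Same quantile built from the standard normal quantile
shifted = mean + stddev * norm.ppf(p)

# True if both routes agree up to floating-point rounding
print(math.isclose(direct, shifted))
```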

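If you look at the source code for scipy.stats.norm, you'll find that the ppf method ultimately calls scipy.special.ndtri. So to compute the inverse of the CDF of the standard normal distribution, you can use that function directly: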

In [43]: from scipy.special import ndtri

In [44]: ndtri(0.95)
Out[44]: 1.6448536269514722

24
I've always thought "percent point function" (ppf) is a terrible name. Most people in statistics just say "quantile function".
William Zhang

15
# Given a random variable X (house price) with population mean mu = 60 and standard deviation sigma = 40
import scipy as sc
import scipy.stats as sct
sc.__version__ # 0.15.1

# a. Find P(X < 50)
sct.norm.cdf(x=50, loc=60, scale=40) # 0.4012936743170763

# b. Find P(X >= 50)
sct.norm.sf(x=50, loc=60, scale=40) # 0.5987063256829237

# c. Find P(60 <= X <= 80)
sct.norm.cdf(x=80, loc=60, scale=40) - sct.norm.cdf(x=60, loc=60, scale=40)

# d. What is the lowest price among the most expensive 5% of houses? That is, find x where P(X >= x) = 0.05
sct.norm.isf(q=0.05, loc=60, scale=40)

# e. What is the highest price among the cheapest 5% of houses? That is, find x where P(X <= x) = 0.05
sct.norm.ppf(q=0.05, loc=60, scale=40)
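
Parts b and d rely on the survival function sf and its inverse isf. As a quick sanity check, sf(x) should equal 1 - cdf(x), and isf(q) should equal ppf(1 - q), up to floating-point rounding. A small sketch reusing the mean 60 and standard deviation 40 from above; the name `house` is just an illustrative label for the frozen distribution.

```python
import math
from scipy.stats import norm

house = norm(loc=60, scale=40)  # frozen distribution with the parameters used above

# The survival function is the complement of the CDF
sf_ok = math.isclose(house.sf(50), 1 - house.cdf(50))

# The inverse survival function mirrors the quantile function
isf_ok = math.isclose(house.isf(0.05), house.ppf(0.95))

print(sf_ok, isf_ok)
```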

6
PS: For the normal distribution, 'loc' is the mean and 'scale' is the standard deviation.
Suresh2692

14

Starting with Python 3.8, the standard library provides the NormalDist object as part of the statistics module.

It can be used to get the inverse cumulative distribution function (inv_cdf, the inverse of the cdf), also known as the quantile function or the percent-point function, for a given mean (mu) and standard deviation (sigma):

from statistics import NormalDist

NormalDist(mu=10, sigma=2).inv_cdf(0.95)
# 13.289707253902943

This can be simplified for the standard normal distribution (mu = 0 and sigma = 1):

NormalDist().inv_cdf(0.95)
# 1.6448536269514715
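
A round trip through cdf is a quick way to confirm that inv_cdf behaves as the inverse. The sketch below uses only the standard library (Python 3.8 or later) and also shows that inv_cdf rejects probabilities that are not strictly between 0 and 1; the variable names are illustrative.

```python
import math
from statistics import NormalDist, StatisticsError

dist = NormalDist(mu=10, sigma=2)

x = dist.inv_cdf(0.95)   # quantile at p = 0.95
p_back = dist.cdf(x)     # should recover 0.95 up to rounding
print(math.isclose(p_back, 0.95))

# p must lie strictly between 0 and 1, otherwise StatisticsError is raised
try:
    dist.inv_cdf(1.0)
    rejected = False
except StatisticsError:
    rejected = True
print(rejected)
```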

1
Great tip! This allows me to drop the dependency on scipy, which I needed just for the single stats.norm.ppf method
Jethro Cao
Licensed under cc by-sa 3.0 with attribution required.