有高原形状的分布吗?


Answers:


53

您可能正在寻找以广义正态(版本1)Subbotin分布或指数幂分布的名称已知的分布。通过位置μ,比例σ和形状β带有pdf )进行参数化

β2σΓ(1/β)exp[(|xμ|σ)β]

如您所见,对于β=1它类似于并收敛于Laplace分布;对于β=2它收敛于正态;当时,其趋于β=均匀分布。

在此处输入图片说明

如果您正在寻找已实现的软件,则可以检查normalp库中的R(Mineo和Ruggieri,2005)。该软件包的优点在于,除其他外,它使用广义的正态分布误差实现回归,即最小化Lp范数。


Mineo,AM和Ruggieri,M.(2005年)。用于指数分布的软件工具:normalp软件包。统计软件杂志,12(4),1-24。


20

@StrongBad的评论是一个非常好的建议。如果正确选择参数,则统一RV和高斯RV的总和可以为您提供所需的准确信息。它实际上有一个相当不错的封闭式解决方案。

该变量的pdf由以下表达式给出:

14a[erf(x+aσ2)erf(xaσ2)]

a is the "radius" of the zero-mean uniform RV. σ is the standard deviation of the zero-mean gaussian RV.

PDFs


3
Reference: Bhattacharjee, G. P., Pandit, S. N. N., and Mohan, R. 1963. Dimensional chains involving rectangular and normal error-distributions. Technometrics, 5, 404–406.
Tim

15

There's an infinite number of "plateau-shaped" distributions.

Were you after something more specific than "in between the Gaussian and the uniform"? That's somewhat vague.

Here's one easy one: you could always stick a half-normal at each end of a uniform:

Density with uniform center and Gaussian tails

You can control the "width" of the uniform relative to the scale of the normal so you can have wider or narrower plateaus, giving a whole class of distributions, which include the Gaussian and the uniform as limiting cases.

The density is:

h2πσe12σ2(xμ+w/2)2Ixμw/2+h2πσIμw/2<xμ+w/2+h2πσe12σ2(xμw/2)2Ix>μ+w/2

where h=11+w/(2πσ)

As σ0 for fixed w, we approach the uniform on (μw/2,μ+w/2) and as w0 for fixed σ we approach N(μ,σ2).

Here are some examples (with μ=0 in each case):

Plot of various examples of this Gaussian-tailed uniform

We might perhaps call this density a "Gaussian-tailed uniform".


1
Ach! I love attending formal balls wearing a Gausian-tailed uniform! ;)
Alexis

7

See my "Devil's tower" distribution in here [1]:

f(x)=0.3334, for |x|<0.9399;
f(x)=0.2945/x2, for 0.9399|x|<2.3242; and
f(x)=0, for 2.3242|x|.

Devil's tower density function with flat top, convex sides, cut off at extremes

The "slip-dress"distribution is even more interesting.

It is easy to construct distributions having whatever shape you want.

[1]: Westfall, P.H. (2014)
"Kurtosis as Peakedness, 1905 – 2014. R.I.P."
Am. Stat. 68(3): 191–195. doi:10.1080/00031305.2014.917055
public access pdf: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4321753/pdf/nihms-599845.pdf


Hi Peter -- I took the liberty of giving the function and inserting an image as well as giving a full reference. (If memory serves I think Kendall and Stuart giving the details of a similar debunking in their classic text. If I remember correctly - it has been a long while - I believe they also discuss that it's not heavy-tailedness)
Glen_b -Reinstate Monica

Thanks, Glen_b. I never said kurtosis measured what the tail-index numbers measure. Rather, my article proves kurtosis is, for a very broad class of distributions, nearly equal to E(Z^4 * I(|Z| > 1)). Thus, kurtosis clearly tells you nothing about the 'peak,' which is typically found in the range {Z: |Z| <1}. Rather, it is determined mostly by the tails. Call it E(Z^4 * I(|Z| > 1)) if the term "heavy-tailedness" has another meaning.
Peter Westfall

Also, @Glen_b which tail-index are you referring to? There are infinitely many. Tail crossings don't define "tailedness" properly. According to some tail crossing definitions of tail heaviness, N(0,1) is more "heavy-tailed" than .9999*U(-1,1) + .0001*U(-1000,1000), although the latter is obviously more heavy tailed, despite having finite tails. And, BTW, the latter has extremely high kurtosis, unlike N(0,1).
Peter Westfall

I can't find me saying "tail index" anywhere in my comment; I am not quite sure what you're referring to there when you say "which tail-index are you referring to". If you mean the bit about heavy-tailedness the best thing to do is check what Kendall and Stuart actually say; I believe there they actually compare the asymptotic ratio of densities for symmetric standardized variables, but it might have been survivor functions perhaps; the point was theirs, not mine
Glen_b -Reinstate Monica

Strange. Well, in any event, Kendall and Stuart got it wrong. Kurtosis is obviously a measure of tail weight as my theorems prove.
Peter Westfall

5

Lots of nice answers. The solution proffered here has 2 features: (i) that it has a particularly simple functional form, and (ii) that the resulting distribution necessarily produces a plateau-shaped pdf (not just as a special case). I'm not sure if this already has a name in the literature, but absent same, let us call it a Plateau distribution with pdf f(x):

f(x)=k11+x2afor xR

where:

  • parameter a is a positive integer, and
  • k is a constant of integration: k=aπsin(π2a)

Here is a plot of the pdf, for different values of parameter a:

enter image description here

.

As parameter a becomes large, the density tends towards a Uniform(-1,1) distribution. The following plot also compares to a standard Normal (gray dashed):

enter image description here


3

Another one (EDIT: I simplified it now. EDIT2: I simplified it even further, though now the picture doesn't really reflect this exact equation):

f(x)=13αlog(cosh(αa)+cosh(αx)cosh(αb)+cosh(αx))

Clunky, I know, but here I took advantage of the fact that log(cosh(x)) approaches a line as x increases.

Basically you have control over how smooth is the transition (alpha). If a=2 and b=1 I guarantee it's a valid probability density (sums to 1). If you choose other values then you'll have to renormalize it.


Here is some sample code in R:

f = function(x, a, b, alpha){
  y = log((cosh(2*alpha*pi*a)+cosh(2*alpha*pi*x))/(cosh(2*alpha*pi*b)+cosh(2*alpha*pi*x)))
  y = y/pi/alpha/6
  return(y)
}

f is our distribution. Let's plot it for a sequence of x

plot(0, type = "n", xlim = c(-5,5), ylim = c(0,0.4))
x = seq(-100,100,length.out = 10001L)

for(i in 1:10){
  y = f(x = x, a = 2, b = 1, alpha = seq(0.1,2, length.out = 10L)[i]); print(paste("integral =", round(sum(0.02*y), 3L)))
  lines(x, y, type = "l", col = rainbow(10, alpha = 0.5)[i], lwd = 4)
}
legend("topright", paste("alpha =", round(seq(0.1,2, length.out = 10L), 3L)), col = rainbow(10), lwd = 4)

Console output:

#[1] "integral = 1"
#[1] "integral = 1"
#[1] "integral = 1"
#[1] "integral = 1"
#[1] "integral = 1"
#[1] "integral = 1"
#[1] "integral = 1"
#[1] "integral = NaN" #I suspect underflow, inspecting the plots don't show divergence at all
#[1] "integral = NaN"
#[1] "integral = NaN"

And plot:

My distribution based on log cosh

You could change a and b, approximately the start and end of the slope respectively, but then further normalization would be needed, and I didn't calculate it (that's why I'm using a = 2 and b = 1 in the plot).


2

If you are looking for something very simple, with a central plateau and the sides of a triangle distribution, you can for instance combine N triangle distributions, N depending on the desired ratio between the plateau and the descent. Why triangles, because their sampling functions already exist in most languages. You randomly sort from one of them.

In R that would give:

library(triangle)
rplateau = function(n=1){
  replicate(n, switch(sample(1:3, 1), rtriangle(1, 0, 2), rtriangle(1, 1, 3), rtriangle(1, 2, 4)))
}
hist(rplateau(1E5), breaks=200)

enter image description here enter image description here


2

Here's a pretty one: the product of two logistic functions.

(1/B) * 1/(1+exp(A*(x-B))) * 1/(1+exp(-A*(x+B)))

This has the benefit of not being piecewise.

B adjusts the width and A adjusts the steepness of the drop off. Shown below are B=1:6 with A=2. Note: I haven't taken the time to figure out how to properly normalize this.

Plateau distribution

By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.