具有模拟功能的重要性抽样低于预期的覆盖率


9

我正在尝试回答R中的重要性抽样评估方法积分问题。基本上,用户需要计算

0πf(x)dx=0π1cos(x)2+x2dx

使用指数分布作为重要性分布

q(x)=λ expλx

并找到的值,该值可以更好地逼近积分(是)。我重铸问题,因为平均值的评价μ˚F X 超过[ 0 π ]:积分然后只是π μλself-studyμf(x)[0,π]πμ

因此,让是的PDF X ù0 π ,并且让ÿ ˚F X :现在的目标是估计p(x)XU(0,π)Yf(X

μ=E[Y]=E[f(X)]=Rf(x)p(x)dx=0π1cos(x)2+x21πdx

使用重要性抽样。我在R中进行了仿真:

# clear the environment and set the seed for reproducibility
rm(list=ls())
gc()
graphics.off()
set.seed(1)

# function to be integrated
f <- function(x){
    1 / (cos(x)^2+x^2)
}

# importance sampling
importance.sampling <- function(lambda, f, B){
    x <- rexp(B, lambda) 
    f(x) / dexp(x, lambda)*dunif(x, 0, pi)
}

# mean value of f
mu.num <- integrate(f,0,pi)$value/pi

# initialize code
means  <- 0
sigmas <- 0
error  <- 0
CI.min <- 0
CI.max <- 0
CI.covers.parameter <- FALSE

# set a value for lambda: we will repeat importance sampling N times to verify
# coverage
N <- 100
lambda <- rep(20,N)

# set the sample size for importance sampling
B <- 10^4

# - estimate the mean value of f using importance sampling, N times
# - compute a confidence interval for the mean each time
# - CI.covers.parameter is set to TRUE if the estimated confidence 
#   interval contains the mean value computed by integrate, otherwise
# is set to FALSE
j <- 0
for(i in lambda){
    I <- importance.sampling(i, f, B)
    j <- j + 1
    mu <- mean(I)
    std <- sd(I)
    lower.CB <- mu - 1.96*std/sqrt(B)  
    upper.CB <- mu + 1.96*std/sqrt(B)  
    means[j] <- mu
    sigmas[j] <- std
    error[j] <- abs(mu-mu.num)
    CI.min[j] <- lower.CB
    CI.max[j] <- upper.CB
    CI.covers.parameter[j] <- lower.CB < mu.num & mu.num < upper.CB
}

# build a dataframe in case you want to have a look at the results for each run
df <- data.frame(lambda, means, sigmas, error, CI.min, CI.max, CI.covers.parameter)

# so, what's the coverage?
mean(CI.covers.parameter)
# [1] 0.19

该代码基本上是重要性采样的简单实现,遵循此处使用的表示法。然后将重要性采样重复次,以获取μ的多个估计值,并且每次检查95%的间隔是否覆盖实际均值。Nμ

如您所见,对于,实际覆盖率仅为0.19。将B增加到10 6这样的值无济于事(覆盖范围甚至更小,为0.15)。为什么会这样呢?λ=20B106


1
可以将无限支撑重要性函数用于有限支撑积分并不是最佳方法,因为可以将模拟的一部分用于模拟零。至少截断处的指数,这很容易做到和模拟。π
西安

@西安可以肯定的是,我同意,如果必须通过重要性抽样评估该积分,则不会使用该重要性分布,但是我试图回答最初的问题,该问题需要使用指数分布。我的问题是,即使此方法远非最佳,但覆盖范围仍应(平均)随着。这就是Greenparker的表现。
DeltaIV

Answers:


3

重要性抽样对重要性分布的选择非常敏感。由于选择了时,样品所绘制使用将有一个平均的1 / 20与方差1 / 400。这就是你得到的分布λ=20rexp1个/201个/400

在此处输入图片说明

但是,要评估的积分从0到。因此,您想使用一个给您这样范围的λ。我用λ = 1π=3.14λλ=1个

在此处输入图片说明

λ=1个ππλ=1个

# clear the environment and set the seed for reproducibility
rm(list=ls())
gc()
graphics.off()
set.seed(1)

# function to be integrated
f <- function(x){
  1 / (cos(x)^2+x^2)
}

# importance sampling
importance.sampling <- function(lambda, f, B){
  x <- rexp(B, lambda) 
  f(x) / dexp(x, lambda)*dunif(x, 0, pi)
}

# mean value of f
mu.num <- integrate(f,0,pi)$value/pi

# initialize code
means  <- 0
sigmas <- 0
error  <- 0
CI.min <- 0
CI.max <- 0
CI.covers.parameter <- FALSE

# set a value for lambda: we will repeat importance sampling N times to verify
# coverage
N <- 100
lambda <- rep(1,N)

# set the sample size for importance sampling
B <- 10^4

# - estimate the mean value of f using importance sampling, N times
# - compute a confidence interval for the mean each time
# - CI.covers.parameter is set to TRUE if the estimated confidence 
#   interval contains the mean value computed by integrate, otherwise
# is set to FALSE
j <- 0
for(i in lambda){
  I <- importance.sampling(i, f, B)
  j <- j + 1
  mu <- mean(I)
  std <- sd(I)
  lower.CB <- mu - 1.96*std/sqrt(B)  
  upper.CB <- mu + 1.96*std/sqrt(B)  
  means[j] <- mu
  sigmas[j] <- std
  error[j] <- abs(mu-mu.num)
  CI.min[j] <- lower.CB
  CI.max[j] <- upper.CB
  CI.covers.parameter[j] <- lower.CB < mu.num & mu.num < upper.CB
}

# build a dataframe in case you want to have a look at the results for each run
df <- data.frame(lambda, means, sigmas, error, CI.min, CI.max, CI.covers.parameter)

# so, what's the coverage?
mean(CI.covers.parameter)
#[1] .95

λ

编辑 - - - -

=104=106ñ=100=104

.19±1.96.191个-.19100=.19±.0769=.1131.2669

=106

ñ=100ñ=1000=104=106.158

.123±1.96.1231个-.1231000=.123±.0203=.102.143

ñ=1000


λ0.1<λ<2λλ=20104106λ
DeltaIV '17

1
ñ=100

1
ñ=1000
By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.