Confused by MCMC Metropolis-Hastings variations: random-walk, non-random-walk, independent, Metropolis

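Over the past few weeks I have been trying to understand MCMC and the Metropolis-Hastings algorithm(s). Every time I think I understand it, I realise that I am wrong. Most of the code examples I find online implement something that is not consistent with their description, i.e. they say they implement Metropolis-Hastings but actually implement random-walk Metropolis. Others (almost always) silently skip the implementation of the Hastings correction ratio because they use a symmetric proposal distribution. In fact, I have not yet found a single simple example that calculates the ratio, which confuses me even more. Can someone give me code examples (in any language) of the following: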

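Vanilla non-random-walk Metropolis-Hastings algorithm with the Hastings correction ratio calculated (even if it ends up being 1 when a symmetric proposal distribution is used).
Vanilla random-walk Metropolis-Hastings algorithm.
Vanilla independent Metropolis-Hastings algorithm.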

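There is no need to provide the Metropolis algorithm because, if I am not mistaken, the only difference between Metropolis and Metropolis-Hastings is that the former always samples from a symmetric proposal distribution and so has no Hastings correction ratio. There is also no need to explain the algorithms in detail. I do understand the basics, but I am confused by all the different names for the variations of the Metropolis-Hastings algorithm, and by how the Hastings correction ratio is implemented in practice for the vanilla non-random-walk MH. Please do not copy-paste links that partially answer my question, because I have most likely already seen them. Those links are what led to this confusion. Thank you.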

mcmc metropolis-hastings

— AstrOne

Answers:

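Here you go: three examples. To make the logic clearer, I have made the code much less efficient than it would be in a real application.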

# We'll assume estimation of a Poisson mean as a function of x
x <- runif(100)
y <- rpois(100,5*x)  # beta = 5 where mean(y[i]) = beta*x[i]

# Prior distribution on log(beta): t(5) with mean 2 
# (Very spread out on original scale; median = 7.4, roughly)
log_prior <- function(log_beta) dt(log_beta-2, 5, log=TRUE)

# Log likelihood
log_lik <- function(log_beta, y, x) sum(dpois(y, exp(log_beta)*x, log=TRUE))

# Random Walk Metropolis-Hastings 
# Proposal is centered at the current value of the parameter

rw_proposal <- function(current) rnorm(1, current, 0.25)
rw_p_proposal_given_current <- function(proposal, current) dnorm(proposal, current, 0.25, log=TRUE)
rw_p_current_given_proposal <- function(current, proposal) dnorm(current, proposal, 0.25, log=TRUE)

rw_alpha <- function(proposal, current) {
   # Due to the structure of the rw proposal distribution, the rw_p_proposal_given_current and
   # rw_p_current_given_proposal terms cancel out, so we don't need to include them - although
   # logically they are still there:  p(prop|curr) = p(curr|prop) for all curr, prop
   exp(log_lik(proposal, y, x) + log_prior(proposal) - log_lik(current, y, x) - log_prior(current))
}

# Independent Metropolis-Hastings
# Note: the proposal is independent of the current value (hence the name), but I maintain the
# parameterization of the functions anyway.  The proposal is not ignorable any more
# when calculating the acceptance probability, as p(curr|prop) != p(prop|curr) in general.

ind_proposal <- function(current) rnorm(1, 2, 1) 
ind_p_proposal_given_current <- function(proposal, current) dnorm(proposal, 2, 1, log=TRUE)
ind_p_current_given_proposal <- function(current, proposal) dnorm(current, 2, 1, log=TRUE)

ind_alpha <- function(proposal, current) {
   exp(log_lik(proposal, y, x)  + log_prior(proposal) + ind_p_current_given_proposal(current, proposal) 
       - log_lik(current, y, x) - log_prior(current) - ind_p_proposal_given_current(proposal, current))
}

# Vanilla Metropolis-Hastings - the independence sampler would do here, but I'll add something
# else for the proposal distribution; a Normal(current, 0.1+abs(current)/5) - symmetric about the
# current value, but with a scale that depends on location, so we can't ignore the proposal
# distribution when calculating alpha, as p(prop|curr) != p(curr|prop) in general

van_proposal <- function(current) rnorm(1, current, 0.1+abs(current)/5)
van_p_proposal_given_current <- function(proposal, current) dnorm(proposal, current, 0.1+abs(current)/5, log=TRUE)
van_p_current_given_proposal <- function(current, proposal) dnorm(current, proposal, 0.1+abs(proposal)/5, log=TRUE)

van_alpha <- function(proposal, current) {
   # The Hastings correction must use the van_* proposal densities defined above
   exp(log_lik(proposal, y, x)  + log_prior(proposal) + van_p_current_given_proposal(current, proposal) 
       - log_lik(current, y, x) - log_prior(current) - van_p_proposal_given_current(proposal, current))
}


# Generate the chain
values <- rep(0, 10000) 
u <- runif(length(values))
naccept <- 0
current <- 1  # Initial value
propfunc <- van_proposal  # Substitute ind_proposal or rw_proposal here
alphafunc <- van_alpha    # Substitute ind_alpha or rw_alpha here
for (i in 1:length(values)) {
   proposal <- propfunc(current)
   alpha <- alphafunc(proposal, current)
   if (u[i] < alpha) {
      values[i] <- exp(proposal)
      current <- proposal
      naccept <- naccept + 1
   } else {
      values[i] <- exp(current)
   }
}
naccept / length(values)
summary(values)

For the vanilla sampler, we get:

> naccept / length(values)
[1] 0.1737
> summary(values)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.843   5.153   5.388   5.378   5.594   6.628

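This is a low acceptance probability, but still... tuning the proposal would help here, as would adopting a different one. Here are the results for the random-walk proposal: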

> naccept / length(values)
[1] 0.2902
> summary(values)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.718   5.147   5.369   5.370   5.584   6.781

As one would hope, the results are similar, with a better acceptance probability (with one parameter, one would aim for roughly 50%).
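To illustrate how the proposal scale drives the acceptance rate of the random-walk sampler, here is a minimal self-contained sketch on the same Poisson model. The helper `run_rw`, the seed, the chain length, and the grid of proposal standard deviations are illustrative assumptions, not part of the original answer.

```r
# Sketch: acceptance rate of random-walk Metropolis as a function of the proposal sd.
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- runif(100)
y <- rpois(100, 5 * x)

# Log posterior of log(beta): Poisson likelihood plus the t(5) prior centred at 2
log_post <- function(log_beta) {
  sum(dpois(y, exp(log_beta) * x, log = TRUE)) + dt(log_beta - 2, 5, log = TRUE)
}

# Hypothetical helper: run a random-walk chain and return its acceptance rate
run_rw <- function(sd, n_iter = 5000, init = 1) {
  current <- init
  current_lp <- log_post(current)
  naccept <- 0
  for (i in seq_len(n_iter)) {
    proposal <- rnorm(1, current, sd)
    proposal_lp <- log_post(proposal)
    # Symmetric proposal: the Hastings ratio is 1, so only the target ratio remains
    if (log(runif(1)) < proposal_lp - current_lp) {
      current <- proposal
      current_lp <- proposal_lp
      naccept <- naccept + 1
    }
  }
  naccept / n_iter
}

sds <- c(0.05, 0.25, 1)          # assumed grid of proposal standard deviations
acc <- sapply(sds, run_rw)
print(data.frame(proposal_sd = sds, acceptance_rate = acc))
```

Small steps are accepted often but move slowly, while large steps are rarely accepted, so the scale is usually tuned somewhere in between.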

For completeness, the independence sampler:

> naccept / length(values)
[1] 0.0684
> summary(values)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  3.990   5.162   5.391   5.380   5.577   8.802

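Because it does not "adapt" to the shape of the posterior, it tends to have the poorest acceptance probability and is the hardest to tune well for this problem.

Note that, generally speaking, we would prefer proposals with fatter tails, but that is a whole other topic.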

— jbowman

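@floyd - it is useful in a number of situations. For example, if you have a decent idea of where the center of the distribution is (e.g. because you have calculated the MLE or MOM estimates) and can pick a fat-tailed proposal distribution. Or if the computation time per iteration is very low, in which case you can run a very long chain (which makes up for the low acceptance rate), saving analysis and programming time that is likely to be far greater than even an inefficient run time. It would not be the typical first-try proposal, though; that would probably be the random walk.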

— jbowman
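The situation described in this comment can be sketched as follows: an independence sampler whose proposal is a fat-tailed Student-t centred at the MLE of log(beta). This is only an illustration on the same Poisson model; the names `centre`, `q_scale`, `t_df`, `draw_proposal`, and `log_q`, together with the scale of 0.1 and 3 degrees of freedom, are assumptions chosen for this sketch and are not tuned.

```r
# Sketch: independence sampler with a fat-tailed proposal centred at the MLE.
set.seed(2)                       # arbitrary seed, for reproducibility only
x <- runif(100)
y <- rpois(100, 5 * x)

log_post <- function(log_beta) {
  sum(dpois(y, exp(log_beta) * x, log = TRUE)) + dt(log_beta - 2, 5, log = TRUE)
}

# For y[i] ~ Poisson(beta * x[i]) the MLE of beta is sum(y) / sum(x)
centre  <- log(sum(y) / sum(x))
q_scale <- 0.1                    # assumed proposal scale, not tuned
t_df    <- 3                      # assumed degrees of freedom (fat tails)

# Proposal: location-scale Student-t that ignores the current state
draw_proposal <- function() centre + q_scale * rt(1, t_df)
log_q <- function(z) dt((z - centre) / q_scale, t_df, log = TRUE) - log(q_scale)

n_iter <- 5000
chain <- numeric(n_iter)
current <- centre
naccept <- 0
for (i in seq_len(n_iter)) {
  proposal <- draw_proposal()
  # Hastings correction: q(current) / q(proposal), because q does not depend on the state
  log_alpha <- log_post(proposal) - log_post(current) + log_q(current) - log_q(proposal)
  if (log(runif(1)) < log_alpha) {
    current <- proposal
    naccept <- naccept + 1
  }
  chain[i] <- exp(current)        # store beta on the original scale
}
acc_rate <- naccept / n_iter
print(acc_rate)
print(summary(chain))
```

Because the proposal is already centred near the bulk of the posterior, the correction term matters mostly in the tails, which is where the fat-tailed choice helps.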

$p(x_{t+1}|x_t)$

$p(x_{t+1}|x_t) = p(x_{t+1})$

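See:

$q()$ ${\bf x}$

The Wikipedia article is a good complementary read. As you can see, Metropolis also has a "correction ratio", but, as mentioned above, Hastings introduced a modification that allows for non-symmetric proposal distributions.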
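For reference, the two acceptance probabilities can be written side by side. Here $\pi$ denotes the target density, $q(x' \mid x)$ the proposal density, $x$ the current state, and $x'$ the proposed state; this notation is chosen for this sketch and is not taken from the posts above.

```latex
\alpha_{\text{Metropolis}}(x \to x') = \min\left(1, \frac{\pi(x')}{\pi(x)}\right),
\qquad
\alpha_{\text{MH}}(x \to x') = \min\left(1, \frac{\pi(x')}{\pi(x)} \cdot \frac{q(x \mid x')}{q(x' \mid x)}\right)
```

The second factor in $\alpha_{\text{MH}}$ is the Hastings correction. It equals 1 whenever $q(x \mid x') = q(x' \mid x)$, which recovers the Metropolis rule.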

The Metropolis algorithm is implemented in the R package mcmc, in the function metrop().

Other code examples:

http://www.mas.ncl.ac.uk/~ndjw1/teaching/sim/metrop/

http://pcl.missouri.edu/jeff/node/322

http://darrenjw.wordpress.com/2010/08/15/metropolis-hastings-mcmc-algorithms/

— Fritz Lang

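Thank you for your reply. Unfortunately, it does not answer any of my questions. I only see random-walk Metropolis, non-random-walk Metropolis, and independent MH. The Hastings correction ratio dnorm(can,mu,sig)/dnorm(x,mu,sig) in the independence sampler of the first link is not equal to 1. I thought it was supposed to equal 1 when a symmetric proposal distribution is used. Is this because it is an independence sampler rather than a plain non-random-walk MH? If so, what is the Hastings ratio for a plain non-random-walk MH?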

— AstrOne

$p(\text{current}|\text{proposal}) = p(\text{proposal}|\text{current})$
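Whether this symmetry condition holds can be checked numerically. The sketch below evaluates the Hastings correction q(current | proposal) / q(proposal | current) at one arbitrary pair of points for the three kinds of proposal used in the answer above. The point values and the helper `van_sd` are assumptions made for illustration. Note that a normal proposal with a fixed mean is symmetric about that mean as a density, but it is not symmetric in the sense required here.

```r
# Sketch: evaluate the Hastings correction q(current | proposal) / q(proposal | current)
# for three proposal types at one arbitrary pair of points.
current  <- 1.0   # illustrative values, chosen arbitrarily
proposal <- 1.6

# Random walk: Normal(current, 0.25); the density depends only on |proposal - current|
rw_ratio <- dnorm(current, proposal, 0.25) / dnorm(proposal, current, 0.25)

# Independence sampler: Normal(2, 1), regardless of the current value
ind_ratio <- dnorm(current, 2, 1) / dnorm(proposal, 2, 1)

# State-dependent scale: Normal(current, 0.1 + abs(current) / 5)
van_sd <- function(centre) 0.1 + abs(centre) / 5
van_ratio <- dnorm(current, proposal, van_sd(proposal)) /
             dnorm(proposal, current, van_sd(current))

print(c(random_walk = rw_ratio, independence = ind_ratio, state_dependent = van_ratio))
```

Only the random-walk ratio equals 1; for the other two proposals the correction has to be included in the acceptance probability.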