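How can I model coin flips until N successes?

You and I decide to play a game in which we take turns flipping a coin. The first player to flip 10 heads in total wins the game. Naturally, there is an argument about who should go first.

Simulations of this game show that the player who flips first wins 6% more often than the player who flips second (the first player wins approximately 53% of the time). I'm interested in modeling this analytically.

This isn't a binomial random variable, because there is no fixed number of trials (we flip until someone gets 10 heads). How can I model it? Is it a negative binomial distribution?

So that my results can be reproduced, here is my Python code: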

import numpy as np
from numba import jit


@jit
def sim(N):

    P1_wins = 0
    P2_wins = 0

    for i in range(N):

        P1_heads = 0
        P2_heads = 0
        while True:

            P1_heads += np.random.randint(0,2)

            if P1_heads == 10:
                P1_wins+=1
                break

            P2_heads+= np.random.randint(0,2)
            if P2_heads==10:
                P2_wins+=1
                break
    return P1_wins/N, P2_wins/N


a,b = sim(1000000)

— Demetri Pananos

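When you flip a coin until $r$ failures and then look at the distribution of the number of successes that occurred before the experiment ended, that is by definition the negative binomial distribution.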

— Tim

I cannot reproduce the 2% value. I find that the first player wins $53.290977425133892\ldots\%$ of the time.

— whuber

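@whuber Yes, I believe you are right. I ran my simulation fewer times than I should have. My results are commensurate with yours.

— Demetri Pananos

If one player wins 53% of the time, then the other should win 47%, so shouldn't the description read "the first player wins 6% more than the second player" or "3% more than half the time"? Not (as it currently says) "3% more than the second player."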

— JesseM

Did you get this question from the FiveThirtyEight Riddler Express?

— foutandabout

Answers:

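The distribution of the number of tails before achieving $10$ heads is negative binomial with parameters $10$ and $1/2$. Let $f$ be the probability function and $G$ the survival function: for each $n\ge 0$, $f(n)$ is a player's chance of $n$ tails before $10$ heads and $G(n)$ is a player's chance of $n$ or more tails before $10$ heads.

Because the players flip independently, the chance that the first player wins after flipping exactly $n$ tails is obtained by multiplying that chance by the chance that the second player flips $n$ or more tails, which gives $f(n)G(n)$.

Summing over all possible $n$ gives the first player's chance of winning as

$\sum_{n=0}^\infty f(n)G(n) \approx 53.290977425133892\ldots\%.$

That is about $3\%$ more than half the time.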
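The sum above can be checked numerically. The following is a minimal sketch in plain Python, assuming a truncation point `n_max` that is large enough for the neglected tail to be negligible; the function name `first_player_win_prob_sum` and its parameters are illustrative and do not come from the answer itself.

```python
from math import comb


def first_player_win_prob_sum(m=10, p=0.5, n_max=2000):
    """Approximate sum_{n>=0} f(n) G(n), truncated at n_max tails (an assumption)."""
    q = 1.0 - p
    # f[n] = P(exactly n tails before the m-th head): negative binomial pmf
    f = [comb(n + m - 1, n) * p**m * q**n for n in range(n_max + 1)]
    # G[n] = P(n or more tails) = sum_{k >= n} f[k], accumulated from the tail end
    G = [0.0] * (n_max + 2)
    for n in range(n_max, -1, -1):
        G[n] = G[n + 1] + f[n]
    # First player wins with exactly n tails when the second has n or more tails
    return sum(f[n] * G[n] for n in range(n_max + 1))


print(first_player_win_prob_sum())
```

With the defaults this should reproduce the figure of roughly 53.29% quoted above; changing `m` or `p` gives the corresponding value for a different target or a biased coin.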

In general, replacing $10$ with any positive integer $m$, the answer can be given in terms of a hypergeometric function: it is equal to

$1/2 + 2^{-2m-1} {_2F_1}(m,m,1,1/4).$

When using a biased coin with a chance $p$ of heads, this generalizes to

$\frac{1}{2} + \frac{1}{2}(p^{2m}) {_2F_1}(m, m, 1, (1 - p)^2).$
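As a rough check of the hypergeometric form, the Gauss series ${_2F_1}(m,m;1;z)=\sum_k \binom{m+k-1}{k}^2 z^k$ can be summed directly for $|z|<1$. The sketch below does this in plain Python; the helper names `hyp2f1_m_m_1` and `win_prob_hypergeometric`, the stopping tolerance, and the term limit are assumptions made for illustration, not part of the answer.

```python
def hyp2f1_m_m_1(m, z, tol=1e-17, max_terms=100_000):
    """Sum the series 2F1(m, m; 1; z) = sum_k [(m)_k / k!]^2 z^k for |z| < 1."""
    term = 1.0   # the k = 0 term
    total = 1.0
    for k in range(max_terms):
        # Ratio of term k+1 to term k is ((m + k) / (k + 1))^2 * z
        term *= ((m + k) / (k + 1)) ** 2 * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def win_prob_hypergeometric(m=10, p=0.5):
    """First player's win probability: 1/2 + (1/2) p^(2m) 2F1(m, m; 1; (1-p)^2)."""
    return 0.5 + 0.5 * p ** (2 * m) * hyp2f1_m_m_1(m, (1.0 - p) ** 2)


print(win_prob_hypergeometric(10, 0.5))
```

For $m=1$ and a fair coin the series is geometric, ${_2F_1}(1,1;1;1/4)=4/3$, so the formula gives $1/2+1/6=2/3$, which is a convenient sanity check.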

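Here is an R simulation of a million such games. It reports an estimate of $0.5325$. A binomial hypothesis test comparing it with the theoretical result has a Z-score of $-0.843$, which is not a significant difference.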

n.sim <- 1e6
set.seed(17)
xy <- matrix(rnbinom(2*n.sim, 10, 1/2), nrow=2)
p <- mean(xy[1,] <= xy[2,])
cat("Estimate:", signif(p, 4), 
    "Z-score:", signif((p - 0.532909774) / sqrt(p*(1-p)) * sqrt(n.sim), 3))

— whuber

It may not be obvious at first glance, but our answers agree numerically: (.53290977425133892 - .5) * 2 is essentially the probability I gave.

— Dougal

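@Dougal Thank you for pointing that out. I looked at your answer, saw the $6.6\%$, and knowing that it did not agree with the form of answer requested in the question, I did not recognize that your calculation was correct. In general, it is a good idea to frame the answer to any question in the form that was requested, if possible: that makes it easy to recognize when it is correct and easy to compare answers.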

— whuber

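@whuber I was responding to the phrase "Simulations of this game show that the player who flips first wins 2% (EDIT: 3% after simulating more games) more than the player who flips second." I interpreted "wins 2% more" as $\Pr(A\text{ wins}) - \Pr(B\text{ wins}) = 2\%$; the correct value is indeed 6.6%. I'm not sure there is a way to interpret "wins 2% more" as meaning "wins 52% of the time," though that is apparently what was intended.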

— Dougal

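@Dougal I agree that the OP's description is confusing and even wrong. However, the code and its results make it clear that he meant "3% more than half the time" rather than "3% more than the other player."

— whuber

@whuber Agreed. Unfortunately, I answered the question before the code was posted, and I didn't run a simulation myself. :)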

— Dougal

We can model the game like this:

Player A flips a coin repeatedly, getting results $A_1, A_2, \dots$, until they have a total of 10 heads. Let the time index of the 10th head be the random variable $X$.
Player B does the same. Let the time index of B's 10th head be the random variable $Y$, which is an iid copy of $X$.
If $X \le Y$, Player A wins; otherwise, Player B wins. That is, $\begin{aligned} \Pr(A\text{ wins}) &= \Pr(X \le Y) = \Pr(X < Y) + \Pr(X = Y)\\ \Pr(B\text{ wins}) &= \Pr(Y < X) = \Pr(X < Y), \end{aligned}$ where the last equality holds because $X$ and $Y$ are iid.

Thus the gap in win rates is

$\Pr(X = Y) = \sum_k \Pr(X = k, Y = k) = \sum_k \Pr(X = k)^2 .$

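As you suspected, $X$ (and $Y$) is distributed essentially according to a negative binomial distribution. Notation varies, but in Wikipedia's parameterization, heads count as "failures" and tails as "successes." We need $r = 10$ "failures" (heads) before the experiment stops, with success probability $p = \tfrac12$. Then the number of "successes," which is $X - 10$, has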

$\Pr(X - 10 = k) = \binom{k + 9}{k} 2^{-10 - k},$ and the collision probability is

$\Pr(X = Y) = \sum_{k=0}^\infty \binom{k + 9}{k}^2 2^{-2k - 20} ,$ which Mathematica helpfully tells us is

$\frac{76\,499\,525}{1\,162\,261\,467} \approx 6.6\%$ .

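Since $\Pr(X < Y) = \Pr(Y < X)$ and these two probabilities together with $\Pr(X = Y)$ sum to 1, Player B's win rate is $\Pr(Y < X) = \tfrac12\left(1 - \Pr(X = Y)\right) \approx 46.7\%$ , and Player A's is $\frac{619\,380\,496}{1\,162\,261\,467} \approx 53.3\%$ .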
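The quoted fraction can be compared against a partial sum of the series using exact rational arithmetic. This is only a sketch: the function name `collision_prob` and the cutoff of 400 terms are assumptions for illustration (the terms decay roughly like $4^{-k}$, so the neglected tail should be far below floating-point precision).

```python
from fractions import Fraction
from math import comb


def collision_prob(m=10, terms=400):
    """Partial sum of sum_k C(k+m-1, k)^2 * 2^(-2k-2m), kept as an exact Fraction."""
    return sum(Fraction(comb(k + m - 1, k) ** 2, 4 ** (k + m)) for k in range(terms))


closed_form = Fraction(76_499_525, 1_162_261_467)  # value quoted from Mathematica above
partial = collision_prob()

print(float(partial), float(closed_form))   # Pr(X = Y), two ways
print(float((1 + closed_form) / 2))         # Player A's win rate
print(float((1 - closed_form) / 2))         # Player B's win rate
```

The last two lines use the identity $\Pr(A\text{ wins}) = \tfrac12\left(1 + \Pr(X = Y)\right)$, which is why twice the first player's edge over one half equals the collision probability.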

— Dougal

The heads need not be in a row, just 10 in total. I assume that is what you are fixing.

— Demetri Pananos

(+1) I like this approach better than the one I posted because it is computationally simpler: it requires only the probability function, which has a simple expression in terms of binomial coefficients.

— whuber

I've submitted an edit replacing the last paragraph questioning the difference from the other answer with an explanation of how their results are actually the same.

— Monty Harder

Let $E_{ij}$ be the event that the player on roll (the one whose turn it is) flips $i$ heads before the other player flips $j$ heads, let $X$ be the outcome of the first two flips (one by each player), with sample space $\{ hh,ht,th,tt\}$ where h means heads and t tails, and let $p_{ij} \equiv \Pr(E_{ij})$ .

Then $p_{ij}=\Pr(E_{i-1,j-1} \mid X=hh)\Pr(X=hh)+\Pr(E_{i-1,j} \mid X=ht)\Pr(X=ht)+\Pr(E_{i,j-1} \mid X=th)\Pr(X=th)+\Pr(E_{ij} \mid X=tt)\Pr(X=tt)$

Assuming a standard coin, $\Pr(X=\cdot)=1/4$ for each outcome, which means that $p_{ij}=\tfrac14\,[p_{i-1,j-1}+p_{i-1,j}+p_{i,j-1}+p_{ij}]$

Solving for $p_{ij}$ gives $p_{ij} = \tfrac13\,[p_{i-1,j-1}+p_{i-1,j}+p_{i,j-1}]$

But $p_{0j}=p_{00}=1$ and $p_{i0}=0$ for $i > 0$ , so the recursion fully terminates. However, a direct naive recursive implementation will perform poorly because the branches overlap, so the same subproblems are recomputed many times.

An efficient implementation will have time complexity $O(ij)$ and memory complexity $O(\min(i,j))$ . Here's a simple fold implemented in Haskell:

Prelude> let p i j = last. head. drop j $ iterate ((1:).(f 1)) start where
  start = 1 : replicate i 0;
  f c v = case v of (a:[]) -> [];
                    (a:b:rest) -> sum : f sum (b:rest) where
                     sum = (a+b+c)/3 
Prelude> p 0 0
1.0
Prelude> p 1 0
0.0
Prelude> p 10 10
0.5329097742513388
Prelude>
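For readers who do not use Haskell, the same recursion can be sketched as a row-by-row dynamic program in Python. The function name `win_prob` is illustrative, and this version keeps one row of length $i+1$, so its memory use is $O(i)$ rather than $O(\min(i,j))$; swapping the roles of the two indices to shrink it further is left out for brevity.

```python
def win_prob(i, j):
    """P(player on roll gets i heads before the other player gets j heads), fair coin."""
    # row[a] holds p_{a,b} for the current b. Start with b = 0:
    # p_{0,0} = 1 and p_{a,0} = 0 for a > 0.
    row = [1.0] + [0.0] * i
    for _ in range(j):
        new = [1.0]  # boundary value p_{0,b} = 1
        for a in range(1, i + 1):
            # p_{a,b} = (p_{a-1,b-1} + p_{a-1,b} + p_{a,b-1}) / 3
            new.append((row[a - 1] + new[a - 1] + row[a]) / 3.0)
        row = new
    return row[i]


print(win_prob(0, 0), win_prob(1, 0), win_prob(10, 10))
```

Each table entry is computed once, so the running time is proportional to $ij$, which avoids the repeated work of the naive recursion.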

UPDATE: Someone in the comments above asked whether one was supposed to roll 10 heads in a row or not. So let $E_{kl}$ be the event that the player on roll flips $i$ heads in a row before the other player flips $i$ heads in a row, given that they have already flipped $k$ and $l$ consecutive heads respectively.

Proceeding as above, but this time conditioning on the first flip only, $p_{k,l} = 1-\tfrac12\,[p_{l,k+1}+p_{l,0}]$ where $p_{il}=p_{ii}=1, p_{ki}=0$

This is a linear system with $i^2$ unknowns and one unique solution.

To convert it into an iterative scheme, simply add an iteration index $n$ and a damping (sensitivity) factor $\epsilon$ :

$p_{k,l,n+1} = 1/(1+\epsilon)*[\epsilon*p_{k,l,n} +1-1/2*(p_{l,k+1,n}+p_{l,0,n})]$

Choose $\epsilon$ and $p_{k,l,0}$ wisely and run the iteration for a few steps and monitor the correction term.
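The iteration can be sketched in Python as follows. The choices here are assumptions made for illustration only: $\epsilon = 1$, a starting guess of $0.5$ everywhere, a stopping rule on the largest correction, and the helper names `boundary_or_value` and `consecutive_heads_win_prob`. Convergence becomes slow as $i$ grows, so the example sticks to small targets.

```python
def boundary_or_value(p, i, k, l):
    """Return p_{k,l}, using the boundary values p_{i,l} = 1 and p_{k,i} = 0."""
    if k == i:
        return 1.0   # player on roll already has i heads in a row
    if l == i:
        return 0.0   # the other player already has i heads in a row
    return p[k][l]


def consecutive_heads_win_prob(i, eps=1.0, tol=1e-12, max_iter=1_000_000):
    """Damped fixed-point iteration for p_{k,l}, 0 <= k, l < i, with a fair coin."""
    p = [[0.5] * i for _ in range(i)]  # initial guess p_{k,l,0} (arbitrary)
    for _ in range(max_iter):
        new = [[(eps * p[k][l] + 1.0
                 - 0.5 * (boundary_or_value(p, i, l, k + 1)
                          + boundary_or_value(p, i, l, 0))) / (1.0 + eps)
                for l in range(i)] for k in range(i)]
        # Monitor the correction term: the largest change in any entry
        correction = max(abs(new[k][l] - p[k][l])
                         for k in range(i) for l in range(i))
        p = new
        if correction < tol:
            break
    return p


table = consecutive_heads_win_prob(2)
print(table[0][0])  # first player's chance when 2 heads in a row are needed
```

For $i = 1$ the system reduces to $p_{0,0} = 1 - \tfrac12 p_{0,0}$, giving $2/3$, which matches the non-consecutive case with a single head and serves as a quick check of the scheme.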

— John Rambo