Regression to the mean vs gambler's fallacy

29

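On the one hand, I have regression to the mean, and on the other hand, I have the gambler's fallacy.

Miller and Sanjurjo (2019) define the gambler's fallacy as "the mistaken belief that random sequences have a systematic tendency towards reversal, i.e. that streaks of similar outcomes are more likely to end than continue." For example, a coin that has landed heads several times in a row would be judged very likely to land tails on the next trial.

According to regression to the mean, if I performed well in the last game, I will probably perform worse in the next one.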

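But according to the gambler's fallacy: consider the following two probabilities, assuming a fair coin

Probability of 20 heads, then 1 tail = $0.5^{20} \times 0.5 = 0.5^{21}$
Probability of 20 heads, then 1 head = $0.5^{20} \times 0.5 = 0.5^{21}$

Then...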

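Consider a simple example: a class of students takes a 100-item true/false test on a subject. Suppose that all students choose randomly on all questions. Then each student's score would be a realization of one of a set of independent and identically distributed random variables, with an expected mean of 50.

Naturally, some students will score substantially above 50 and some substantially below 50 just by chance. If one takes only the top-scoring 10% of the students and gives them a second test on which they again choose randomly on all items, the mean score would again be expected to be close to 50.

Thus the mean of these students would "regress" all the way back to the mean of all students who took the original test. No matter what a student scores on the original test, the best prediction of their score on the second test is 50.

In particular, if one takes only the top-scoring 10% of the students and gives them a second test on which they again choose randomly on all items, the mean score would again be expected to be close to 50.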
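The retest claim above can be checked with a small simulation. This is only a sketch under the stated assumption of pure guessing; the function name `simulate_retest`, the class size of 1000 and the seed are illustrative choices, not part of the original example.

```python
import random

def simulate_retest(n_students=1000, n_items=100, top_fraction=0.10, seed=0):
    """Return (mean first-test score of the top group, mean retest score of that group)."""
    rng = random.Random(seed)

    def guess_score():
        # Each item is answered correctly with probability 0.5 (pure guessing).
        return sum(rng.random() < 0.5 for _ in range(n_items))

    first = [guess_score() for _ in range(n_students)]
    # Pick the students with the highest first-test scores.
    n_top = int(n_students * top_fraction)
    top = sorted(range(n_students), key=lambda i: first[i], reverse=True)[:n_top]
    first_mean = sum(first[i] for i in top) / n_top
    # The retest is a fresh set of guesses, independent of the first test.
    retest_mean = sum(guess_score() for _ in top) / n_top
    return first_mean, retest_mean

first_mean, retest_mean = simulate_retest()
print(f"top 10% on first test: {first_mean:.1f}, same students on retest: {retest_mean:.1f}")
```

The selected group looks strong on the first test only because it was selected on a lucky outcome; its retest scores are drawn from the same distribution as everyone else's.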

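According to the gambler's fallacy, shouldn't one expect all scores to be equally likely, and not necessarily close to 50?

Miller, J. B., and Sanjurjo, A. (2019). How Experience Confirms the Gambler's Fallacy When Sample Size Is Neglected.

— Luis P.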
source

5

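I fail to see how the "gambler's fallacy" is connected with the two probabilities you compute. Could you explain more precisely what you understand this fallacy to be?

— whuber

Is there a longest sequence of heads in your game?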

— AdamO '16

1

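I'd really like an explanation of this. The answers so far don't seem to resolve it for me. Regression to the mean appears to make independent events dependent. Perhaps regression to the mean can never be applied to a single observation; it only applies when there is a mean.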

— icc97 '16

28

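I think the confusion can be resolved by considering that the concept of "regression to the mean" really has nothing to do with the past. It's merely the tautological observation that at each iteration of an experiment we expect the average outcome. So if we previously had an above average outcome then we expect a worse result, and if we had a below average outcome we expect a better one. The key point is that the expectation itself does not depend on any previous history, as it does in the gambler's fallacy.

— dsaxton
source

Exactly. In the context of this question, if heads are interpreted as a "good outcome", then in the OP's example a worse outcome is likely to follow a streak of good outcomes, and a better outcome is likely to follow a streak of bad outcomes.

— amoeba says Reinstate Monica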

5

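It seems you are contradicting yourself. You state "the expectation itself does not depend on any previous history" and "if we previously had an above average outcome then we expect a worse result". You use the word expectation in both places, and you talk about the past/previous history in both places.

— Erik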

6

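There is no contradiction. We don't expect a worse result because the outcomes actually depend on each other; we expect a worse result because we saw an outcome that exceeded our expectation. The expectation itself is constant and doesn't change as a result of seeing the previous outcome.

— dsaxton '16

@Erik Perhaps rephrasing might help, but the point to note is how to distinguish the two aspects. First, we expect the average outcome, or rather believe it to be the most likely. When compared with an actual outcome, that expectation can look relatively good or bad, depending on how good or bad the outcome was relative to our expectation. We have no information about the future! We are only comparing the actual outcome to the average. (This comment has now been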

— 不再赘述

9

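Downvoting, because your answer is ambiguous at the outset. Namely, what is a "worse" result after an above average outcome? The OP interprets it as "worse than average" (an interpretation that feels intuitively right because of the just-world fallacy), while regression to the mean means it will be "worse than the previous result". Without removing the source of the confusion, your (correct) answer is understandable only to those who already know the right answer. If you edit it in some form, you'll get my upvote.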

— rumtscho

17

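If you find yourself in the position of a rational person (and assuming a fair coin), your best bet is just to guess. If you find yourself in the position of a superstitious gambler, your best bet is to look at the prior events and try to justify your reasoning about the past, e.g. "Wow, heads are hot, time to ante up!" or "There's no way we're going to see another heads; the probability of that kind of streak is incredibly low!".

The gambler's fallacy is not realizing that every particular string of 20 coin tosses is wildly unlikely: for example, it's very unlikely to flip 10 heads and then 10 tails, very unlikely to flip alternating heads and tails, very unlikely to split into runs of 4, etc. It's even very unlikely to flip HHTHHTTTHT, because for any string there is only one way for that to happen out of many, many different outcomes. Thus, labelling any one of these as "likely" or "unlikely" is a fallacy, as they are all equiprobable.

Regression to the mean is the rightly-founded belief that in the long run, your observations should converge to a finite expected value. For example, my bet that 10 of 20 coin tosses come up heads is a good one because there are many ways of achieving it. A bet on 15 of 20 is substantially less likely to win, since far fewer strings reach that final count. It's worth noting that if you sit around and flip (fair) coins long enough, you'll eventually end up with something that's roughly 50/50, but you won't end up with something that has no "streaks" or other improbable events in it. This is the heart of the difference between these two concepts.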
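To put numbers on the counting argument, here is a minimal sketch using Python's standard `math.comb`. The variable names are illustrative; the only assumption is 20 independent tosses of a fair coin.

```python
from math import comb

N = 20
total = 2 ** N  # number of equally likely head/tail strings of length 20

# Any one specific string (a streak of 20 heads, or a "random-looking" one).
p_specific = 1 / total
# Probability that the head count is exactly 10, and exactly 15.
p_10 = comb(N, 10) / total
p_15 = comb(N, 15) / total

print(f"any specific string: {p_specific:.2e}")
print(f"exactly 10 heads:    {p_10:.4f} ({comb(N, 10)} strings)")
print(f"exactly 15 heads:    {p_15:.4f} ({comb(N, 15)} strings)")
```

Every individual string has the same tiny probability; the counts differ only in how many strings produce them.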

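TL;DR: Regression to the mean says that over time, you'll end up with a distribution that reflects the expected outcome of any experiment. The gambler's fallacy (wrongly) says that each individual flip of a coin has memory of the previous results, which should affect the next independent outcome.

— Derek Janni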
source

1

So, is the gambler's fallacy a wrong concept? I couldn't get the gist of that. Sorry

— Luis P.

6

The gambler's fallacy is... a fallacy. It is wrong; it is faulty reasoning. Regression to the mean, though, is pure statistics :)

— Derek Janni

1

Regression to the mean is the rightly-founded belief that in the long run, your observations should converge to a finite expected value

- This is the "gambler's fallacy" - after the occurrence of

— Izkata

2

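@Izkata Not quite. Regression to the mean states that over a large number of trials, the streaks on either side should roughly even out, and the more trials you run, the closer you get to the true mean. If you flip enough to get a streak of 100 heads, you probably also have streaks of tails somewhere in the distribution to balance it out, since streaks of heads and tails are equally likely. Importantly, regression to the mean makes no assumptions about any particular datum, only about the aggregate values as the sample size grows.

— Ethan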

1

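@Izkata The gambler's fallacy makes claims about what will happen with any particular outcome; regression to the mean makes a general statement about what we'd expect from many outcomes.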

— Derek Janni

5

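I always try to remember that regression to the mean is not a compensatory mechanism for observing outliers.

There is no cause-and-effect relationship between having an outstanding gambling run and then going 50-50 afterwards. It's just a helpful way to remember that, when you're sampling from a distribution, you're most likely to see values close to the mean (think of what Chebyshev's inequality has to say here).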

— ul龙
source

2

Yay Chebyshev! Great point!

— Derek Janni

4

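Here's a simple example: you've decided to toss a total of 200 coins. So far you've tossed 100 of them and you've been extremely lucky: 100% came up heads (incredible, I know, but let's keep things simple).

Conditional on 100 heads in the first 100 tosses, you expect to have 150 heads in total at the end of the game. An extreme example of the gambler's fallacy would be to think that you should still expect only 100 heads in total (i.e., the expected value before starting the game), even after getting 100 heads in the first 100 tosses. The gambler fallaciously thinks that the next 100 tosses must be tails. An example of regression to the mean (in this context) is that your heads rate of 100% is expected to fall to 150/200 = 75% (i.e., toward the mean of 50%) by the time you finish the game.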
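The arithmetic behind the 150 can be written out as a tiny helper. This is a sketch only; `expected_final_heads` is a made-up name, and it assumes the remaining tosses are independent with a fixed heads probability.

```python
def expected_final_heads(total_tosses, tosses_done, heads_so_far, p_heads=0.5):
    """Expected total heads at the end, given what has already been observed."""
    remaining = total_tosses - tosses_done
    # Past tosses are fixed; only the remaining ones contribute an expectation.
    return heads_so_far + remaining * p_heads

expected = expected_final_heads(200, 100, 100)
print(expected, expected / 200)  # 150.0 0.75
```

The past heads are neither "paid back" (which would give 100) nor extrapolated (which would give 200); the remaining tosses simply contribute their usual 50%.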

— Adrian
source

1

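@whuber, this isn't the classical father-son height example, but I'd argue it satisfies Wikipedia's definition: "regression toward (or to) the mean is the phenomenon that if a variable (e.g. the fraction of coin tosses landing heads) is extreme on its first measurement, it will tend to be closer to the average on its second measurement"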

— Adrian

3

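Be careful with Wikipedia: its introductory language is intended only to give some heuristic idea, but it is rarely a definition. In fact, your quotation is neither a definition (since it doesn't state what "extreme" means) nor correct under most interpretations. For instance, for any continuous random variable there is exactly a $1/2$ chance that the second of two independent trials is further from the mean than the first.

— whuber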

1

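I think that providing a clear description of the gambler's fallacy and of regression to the mean may be more important than offering examples. When only examples are given, it's not clear how they should be understood or how they relate to these two subjects.

— whuber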

1

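As someone who thought along the same lines as the OP, your second paragraph is the only example in all the answers that clearly explains what the difference is. Now it makes more sense.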

— Izkata

1

@whuber That's exactly what most of the other answers are doing, and they didn't clear it up for me at all.

— '16

2

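I could be wrong, but I have always thought the difference lies in independence.

In the gambler's fallacy, the issue is a misunderstanding of independence. Sure, over some large number N of coin tosses you will get around a 50-50 split, but if by chance you don't, the idea that your next T tosses will help even out the odds is wrong, because each coin toss is independent of the previous ones.

Regression to the mean, where I see it used, is the idea that draws depend on previous draws or on a previously calculated average/value. For example, let's use NBA shooting percentage. If player A has made on average 40% of his shots during his career and starts a new year by shooting 70% in his first 5 games, it is reasonable to think that he will regress to his career average. There are dependent factors that can influence his play: hot/cold streaks, teammates' play, confidence, and the simple fact that if he maintained 70% shooting for the year he would absolutely annihilate multiple records, which is simply an impossible feat (given the current performance abilities of professional basketball players). As you play more games, your shooting percentage will likely drop closer to your career average.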

— Marsenau
source

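Your explanation of regression to the mean sounds more like a shrinkage estimator. Could you provide a specific definition of what "regression" actually means?

— whuber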

I was following the idea of "The phenomenon occurs because student scores are determined in part by underlying ability and in part by chance" from Wikipedia. My understanding is while there is a level of probability, the results are driven by some underlying ability.

— Marsenau

2

Thank you for that clarification. It's not evident how that idea applies to the idea that as one's career progresses, one's average draws closer to the career average. That sounds either like a tautology or some version of a law of large numbers. In fact, it sounds awfully like the Gambler's Fallacy itself!

— whuber

1

Or your career average will rise to meet your new abilities. :) I think it is a mistake to muddy the water with an improvable skill.

— Erik

1

"misunderstanding of independence" - this appears to be the critical point. Regression to the mean appears to make independent events dependent.

— icc97

2

The key is that we don't have any information that will help us with the next event (gambler's fallacy), because the next event isn't dependent on the previous event. We can make a reasonable guess about how a series of trials will go. This reasonable guess is the average, aka our expected mean result. So when we watch a deviation in the mean trend back toward the mean, over time/trials, we are witnessing a regression to the mean.

As you can see, regression to the mean is an observed series of actions; it isn't a predictor. As more trials are conducted, things will more closely approximate a normal/Gaussian distribution. This means that I'm not making any assumptions or guesses about what the next result will be. Using the law of large numbers, I can theorize that even though things might be trending one way currently, over time things will balance themselves out. When they do balance themselves out, the result set has regressed to the mean. It is important to note here that we aren't saying that future trials are dependent on past results. I'm merely observing a change in the balance of the data.

The gambler's fallacy, as I understand it, is more immediate in its goals and focuses on the prediction of future events. This tracks with what a gambler desires. Typically, games of chance are tilted against the gambler over the long term, so a gambler wants to know what the next trial will be because they want to capitalize on this knowledge. This leads the gambler to falsely assume that the next trial is dependent on the previous trial. This can lead to neutral choices like:

The last five times the roulette wheel landed on black, so therefore next time I'm betting big on red.

Or the choice can be self-serving:

I've gotten a full house the last 5 hands, so I'm going to bet big because I'm on a winning streak and can't lose.

So as you can see, there are a few key differences:

Regression to the mean doesn't assume that independent trials are dependent like the gambler's fallacy.
Regression to the mean is applied over a large amount of data/trials, where the gambler's fallacy is concerned with the next trial.
Regression to the mean describes what has already taken place. Gambler's fallacy attempts to predict the future based on an expected average, and past results.

— Erik
source

1

Actually I don't think that regression to the mean has anything to do with the law of large numbers or that it signifies what you say it does in the first sentence.

— amoeba says Reinstate Monica

@amoeba so if we plan on flipping a coin 100 times and 20 flips into the trial we have 20 heads. At the end of the trial we have 55 heads. I'm trying to say that this would be an example of "regression to the mean." It started off lop-sided but over time it normalized. The law of large numbers bit was another way of expressing the idea that things will average out over enough trials, which is the same as saying an initial imbalance will balance out over time or regress toward the mean.

— Erik

1

I guess I am starting to get the gist of those themes with your keys, Erik. Beautiful! :) xxx

— Luis P.

2

Are students with higher grades who score worse on retest cheaters?

The question received a substantial edit since the last of six answers.

The edited question contains an example of regression to the mean in the context of student scores on a $100$ question true-false test and a retest for the top performers on an equivalent test. The retest shows substantially more average scores for the group of top performers on the first test. What's going on? Were the students cheating the first time? No, it is important to control for regression to the mean. Test performance for multiple choice tests is a combination of luck in guessing and ability/knowledge. Some portion of the top performers' scores was due to good luck, which was not necessarily repeatable the second time.

Or should they just stay away from the roulette wheel?

Let's first assume that no skill at all was involved, that the students were just flipping (fair) coins to determine their answers. What's the expected score? Well, each answer has independently a $50\%$ chance of being the correct one, so we expect $50\%$ of $100$ or a score of $50$ .

But, that's an expected value. Some will do better merely by chance. The probability of scoring at least $60\%$ correctly according to the binomial distribution is approximately $2.8\%$ . So, in a group of $3000$ students, the expected number of students to get a grade of $60\%$ or better is $85$ .

Now let's assume indeed there were $85$ students with a score of $60\%$ or better and retest them. What's the expected score on retest under the same coin-flipping method? It's still $50\%$ of $100$ ! What's the probability that a student being retested in this manner will score at least $60\%$ ? It's still $2.8\%$ ! So we should expect only $2$ of the $85$ ( $2.8\% \cdot 85$ ) to score at least $60\%$ on retest.

Under this setup it is a fallacy to assume an expected score on retest different from the expected score on the first test -- they are both $50\%$ of $100$ . The gambler's fallacy would be to assume that the good luck of the high scoring students is more likely to be balanced out by bad luck on retest. Under this fallacy, you'd bet on the expected retest scores to be below $50$ . The hot-handed fallacy (here) would be to assume that the good luck of the high scoring students is more likely to continue and bet on the expected retest scores to be above $50$ .
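For anyone who wants to reproduce the $2.8\%$ figure, here is a minimal sketch that sums the exact binomial probabilities with the standard library. The helper name `binom_tail` is my own; it assumes $100$ independent fair-coin answers.

```python
from math import comb

def binom_tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p), by summing the exact pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

p60 = binom_tail(100, 60, 0.5)
print(f"P(score >= 60) = {p60:.4f}")
print(f"expected high scorers among 3000 students: {3000 * p60:.0f}")
print(f"expected repeat high scorers among 85: {85 * p60:.1f}")
```

The same probability applies on the first test and on the retest, which is exactly why neither compensation nor a hot hand should be expected under this model.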

Lucky coins and lucky flips

Reality is a bit more complicated. Let's update our model. First, it doesn't matter what the actual answers are if we are just flipping coins, so let's just score by number of heads. So far, the model is equivalent. Now let's assume $1000$ coins are biased to be heads with probability of $55\%$ (good coins $G$ ), $1000$ coins are biased to be heads with probability of $45\%$ (bad coins $B$ ), and $1000$ have equal probability of being heads or tails (fair coins $F$ ) and randomly distribute these. This is analogous to assuming higher and lower ability/knowledge under the test taking example, but it is easier to reason correctly about inanimate objects.

The expected score is $(55 \cdot 1000 + 45 \cdot 1000 + 50 \cdot 1000)/3000 = 50$ for any student given the random distribution. So, the expected score for the first test has not changed. Now, the probability of scoring at least $60\%$ correctly, again using the binomial distribution, is $18.3\%$ for good coins, $0.2\%$ for bad coins, and of course $2.8\%$ still for the fair coins. The probability of scoring at least $60\%$ is, since an equal number of each type of coin was randomly distributed, the average of these, or $7.1\%$ . The expected number of students scoring at least $60\%$ correctly is $7.1\% \cdot 3000 = 213$ .

Now, if we do indeed have $213$ scoring at least $60\%$ correctly under this setup of biased coins, what's the expected score on retest? Not $50\%$ of $100$ anymore! Now you can work it out with Bayes' theorem, but since we used equal size groups the probability of having a type of coin given an outcome is (here) proportional to the probability of the outcome given the type of coin. In other words, there is a $86\% = 18.3\%/(18.3\% + 0.2\% + 2.8\%)$ chance that those scoring at least $60\%$ had a good coin, $1\% = 0.2\%/(18.3\% + 0.2\% + 2.8\%)$ had a bad coin, and $13\%$ had a fair coin. The expected value of scores on retest is therefore $86\% \cdot 55 + 1\% \cdot 45 + 13\% \cdot 50 = 54.25$ out of $100$ . This is lower than actual scores of the first round, at least $60$ , but higher than the expected value of scores before the first round, $50$ .
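The Bayes step can be sketched in a few lines. This assumes the three equally common coin types described above; `binom_tail` and the dictionary names are illustrative, and the exact tail probabilities differ slightly from the rounded percentages in the text.

```python
from math import comb

def binom_tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p), by summing the exact pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Heads probability for each coin type; equal priors (1000 coins of each).
coins = {"good": 0.55, "fair": 0.50, "bad": 0.45}

# Likelihood of scoring at least 60 out of 100 with each coin type.
likelihood = {name: binom_tail(100, 60, p) for name, p in coins.items()}

# With equal priors, the posterior is just the normalized likelihood.
evidence = sum(likelihood.values())
posterior = {name: likelihood[name] / evidence for name in coins}

# Expected retest score for a student known to have scored at least 60.
expected_retest = sum(posterior[name] * 100 * coins[name] for name in coins)

for name in coins:
    print(f"{name}: P(>=60) = {likelihood[name]:.4f}, posterior = {posterior[name]:.3f}")
print(f"expected retest score: {expected_retest:.2f}")
```

The retest expectation sits between the prior mean of $50$ and the selected scores of at least $60$, which is the partial regression described above.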

So even when some coins are better than others, randomness in the coin flips means that selecting the top performers from a test will still exhibit some regression to the mean in a retest. In this modified model, hot-handedness is no longer an outright fallacy -- scoring better in the first round does mean a higher probability of having a good coin! However, gambler's fallacy is still a fallacy -- those who experienced good luck cannot be expected to be compensated with bad luck on retest.

— A. Webb
source

I've just got an idea. I'm gonna simulate that model and see how it works.

— Luis P.

1

They are saying the same thing. You were mostly confused because no single experiment in the coin flip example has an extreme result (H/T 50/50). Change it to "flipping ten fair coins at the same time in every experiment", and gamblers want to get all of them right. Then an extreme measurement would be that you happen to see all of them come up heads.

Gambler's fallacy: Treat each gamble outcome (coin flipping result) as IID. If you already know the distribution those IID variables share, then the next prediction should come directly from the known distribution and has nothing to do with historical (or future) results (aka the other IID variables).

Regression to the mean: Treat each test outcome as IID (since the student is assumed to be guessing randomly and to have no real skill). If you already know the distribution those IID variables share, then the next prediction comes directly from the known distribution and has nothing to do with historical (or future) results (aka the other IID variables) (exactly as before up to here). But, by CLT, if you observed extreme values in one measurement (e.g. by chance you were only sampling the top 10% of students from the first test), you should know the result from your next observation/measurement will still be generated from the known distribution (and is thus more likely to be closer to the mean than to stay at the extreme).

So fundamentally, they both say the next measurement will come from the distribution instead of past results.

— Yey
source

This is not a correct citation of the central limit theorem. It is merely a statement of what an independent event is.

— AdamO

0

Let X and Y be two i.i.d. uniform random variables on [0,1]. Suppose we observe them one after another.

Gambler's Fallacy: P( Y | X ) != P( Y ) This is, of course, nonsense because X and Y are independent.

Regression to the mean: P( Y < X | X = 1) != P( Y < X ) This is true: LHS is 1, RHS is 1/2
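A quick Monte Carlo sketch of both statements follows. Since X = 1 has probability zero, it conditions on X > 0.9 as a stand-in for "X was extreme"; that threshold, the sample size and the seed are my assumptions.

```python
import random

rng = random.Random(0)
pairs = [(rng.random(), rng.random()) for _ in range(200_000)]
extreme = [(x, y) for x, y in pairs if x > 0.9]  # first draw was extreme

def frac(event, sample):
    """Fraction of (x, y) pairs in `sample` for which `event(x, y)` holds."""
    return sum(event(x, y) for x, y in sample) / len(sample)

# Independence: knowing X was extreme tells us nothing about Y itself.
p_y_high = frac(lambda x, y: y > 0.5, pairs)
p_y_high_given_extreme = frac(lambda x, y: y > 0.5, extreme)

# Regression to the mean: Y is far more likely to fall below an extreme X.
p_y_below_x = frac(lambda x, y: y < x, pairs)
p_y_below_x_given_extreme = frac(lambda x, y: y < x, extreme)

print(p_y_high, p_y_high_given_extreme)
print(p_y_below_x, p_y_below_x_given_extreme)
```

The distribution of Y does not move at all; what changes is how Y compares with an X that was selected for being large.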

— anonymous
source

0

Thanks to your answers, I think I now understand the difference between regression to the mean and the gambler's fallacy. Moreover, I built a data set to help me illustrate it in a "real" case.

I built this situation: I took 1000 students and had them take a test, answering the questions at random.

The test score ranges from 01 to 05. Since they answer at random, each score has a 20% chance of being achieved. So for the first test, the number of students with a score of 05 should be close to 200.

(1.1) $1000 \times 0.20$

(1.2) $200$

I had 196 students with a score of 05, which is very close to the expected 200 students.

So I had those 196 students repeat the test; 39 students are expected to score 05.

(2.1) $196 \times 0.20$

(2.2) $39$

Well, according to the result, I got 42 students, which is within what was expected.

I had those who scored 05 repeat the test, and so on and so forth...

Therefore, the expected numbers were:

Expected RETEST 03

(3.1) $42 \times 0.20$

(3.2) $8$

(3.3) Outcomes (8)

Expected RETEST 04

(4.1) $8 \times 0.20$

(4.2) $1.6$

(4.3) Outcomes (2)

Expected RETEST 05

(4.1) $2 \times 0.20$

(4.2) $0.4$

(4.3) Outcomes (0)

If I am waiting for a student who scores 05 four times in a row, I face a probability of $0.20^4$ , i.e. 1.6 students per 1000. However, if I am waiting for a student who scores 05 five times in a row, I need at least 3,500 samples in order to expect 1.12 students with a score of 05 on all tests.

(5.1.) $0.20^5 = 0.00032$

(5.2.) $0.00032 \times 3500 = 1.12$

Therefore, the probability that one student scores 05 on all 05 tests has nothing to do with his last score. I mean, I must not calculate the probability for each test singly; I must treat those 05 tests as one event and calculate the probability for that event.
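A rough sketch of how such a data set could be simulated is below. It is not the code behind the numbers above; the function names, the seed and the use of Python's `random` module are my own assumptions.

```python
import random

def expected_survivors(n_students, p_top, n_tests):
    """Expected number of students scoring the top mark on all of n_tests tests."""
    return n_students * p_top ** n_tests

def simulate_survivors(n_students=1000, n_scores=5, n_tests=5, seed=0):
    """Retest only the students who got the top score; return the count after each test."""
    rng = random.Random(seed)
    counts = []
    remaining = n_students
    for _ in range(n_tests):
        # Each remaining student gets a uniformly random score from 1 to n_scores.
        remaining = sum(rng.randint(1, n_scores) == n_scores for _ in range(remaining))
        counts.append(remaining)
    return counts

counts = simulate_survivors()
for test_number, observed in enumerate(counts, start=1):
    print(test_number, observed, round(expected_survivors(1000, 0.20, test_number), 2))
```

On each round, about 20% of whoever is left scores 05 again, regardless of how many times they have already done so.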

— Luis P.
source