The Intuition behind Bayes Theorem

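Bayes Theorem is a well-known result in probability theory. Mathematically, you can write it as:
P(A|B) = P(A and B)/P(B) = P(B|A)*P(A)/P(B).
Here P(A) is the prior, your belief in A before seeing B, and P(A|B) is the posterior, your belief in A after seeing B.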

An interesting interview in Scientific American with decision theorist Eliezer Yudkowsky explains Bayes Theorem more intuitively:

I might answer that Bayes’s Theorem is a kind of Second Law of Thermodynamics for cognition. If you obtain a well-calibrated posterior belief that some proposition is 99% probable, whether that proposition is milk being available at the supermarket or global warming being anthropogenic, then you must have processed some combination of sufficiently good priors and sufficiently strong evidence. That’s not a normative demand, it’s a law. In the same way that a car can’t run without dissipating entropy, you simply don’t get an accurate map of the world without a process that has Bayesian structure buried somewhere inside it, even if the process doesn’t explicitly represent probabilities or likelihood ratios. You had strong-enough evidence and a good-enough prior or you wouldn’t have gotten there.

On a personal level, I think the main inspiration Bayes has to offer us is just the fact that there are rules, that there are iron laws that govern whether a mode of thinking works to map reality. Mormons are told that they’ll know the truth of the Book of Mormon through feeling a burning sensation in their hearts. Let’s conservatively set the prior probability of the Book of Mormon at one to a billion (against). We then ask about the likelihood that, assuming the Book of Mormon is false, someone would feel a burning sensation in their heart after being told to expect one. If you understand Bayes’s Rule you can see at once that the improbability of the evidence is not commensurate with the improbability of the hypothesis it’s trying to lift.
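The arithmetic behind that last sentence can be sketched in the odds form of Bayes Theorem: posterior odds = prior odds * likelihood ratio. The prior odds of one to a billion come from the quote; the two likelihoods below (1.0 and 0.5) are illustrative assumptions of mine, not figures from the interview, and the function names are hypothetical.

```python
def update_odds(prior_odds, p_evidence_if_true, p_evidence_if_false):
    """Odds form of Bayes Theorem: posterior odds = prior odds * likelihood ratio."""
    likelihood_ratio = p_evidence_if_true / p_evidence_if_false
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favor (e.g. 1/3 for "1 to 3") into a probability."""
    return odds / (1.0 + odds)


# Prior odds from the quote: one to a billion (against).
prior_odds = 1.0 / 1e9

# ASSUMPTION: the sensation is certain if the hypothesis is true, and still
# happens half the time if it is false (people were told to expect it).
posterior_odds = update_odds(prior_odds, 1.0, 0.5)
posterior = odds_to_probability(posterior_odds)

print(f"posterior probability: {posterior:.2e}")
```

Under these assumed numbers the evidence only doubles the odds, so the posterior stays around two in a billion. Evidence that is almost as likely when the hypothesis is false cannot lift a very improbable hypothesis.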

To give another example, let’s say that you are a kid waiting for your mom to pick you up. For the past 2 years, she has picked you up from school every day between 0 and 5 minutes after the school day ended. If your mom is 1 minute late, what is the probability that she is not going to pick you up? The answer is probably pretty low, given that your expected probability (your prior) of your mom picking you up is so strong.

On the other hand, let’s say your mom is very irresponsible: over the past 2 years she has picked you up between 0 and 5 minutes after the school day ended only half the time, and the other half of the time she has forgotten to show up and you have had to walk home. In this case, a 1-minute delay in your mom’s arrival would be much more convincing evidence that she will not show up, because your expected probability (your prior) of her showing up was much lower.

Thus, with the same current data available (i.e., your mom is 1 minute late), Bayes Theorem tells you that you should not be worried if your mom has a great track record, but a bit more worried if over the past 2 years your mom has been a bit of a deadbeat.
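To put rough numbers on the two scenarios, here is a minimal sketch. It rests on assumptions that are mine rather than part of the story: when mom does come, her arrival time is uniform over the 5-minute window; a no-show prior of 0.001 stands in for the "great track record"; and 0.5 is the prior for the mom who forgets half the time. The function name `prob_no_show` is hypothetical.

```python
def prob_no_show(prior_no_show, minutes_waited, window=5.0):
    """P(mom will not come | she has not arrived after `minutes_waited`).

    ASSUMPTION: if she is coming, her arrival time is uniform on [0, window].
    """
    # Likelihood of still waiting under each hypothesis.
    p_wait_if_coming = max(0.0, 1.0 - minutes_waited / window)
    p_wait_if_no_show = 1.0  # if she never comes, you are certainly still waiting

    numerator = p_wait_if_no_show * prior_no_show
    evidence = numerator + p_wait_if_coming * (1.0 - prior_no_show)
    if evidence == 0.0:
        raise ValueError("this observation is impossible under both hypotheses")
    return numerator / evidence


# Same evidence (1 minute of waiting), two different priors.
reliable = prob_no_show(prior_no_show=0.001, minutes_waited=1)
unreliable = prob_no_show(prior_no_show=0.5, minutes_waited=1)

print(f"reliable mom:   {reliable:.4f}")
print(f"unreliable mom: {unreliable:.4f}")
```

With these assumed inputs, the reliable mom's no-show probability stays close to 0.1%, while the unreliable mom's rises from 50% to about 56%. The evidence is identical in both cases; only the prior differs.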
