Improving our beliefs: Bayes' Theorem

Heurística Lab
May 3, 2024

Updated: May 6, 2024

How to make better decisions under uncertainty (Part 2)

By Pedro Del Carpio.

In the article “Molding Uncertainty”, the first part of the How to make better decisions under uncertainty series, I asked the readers to imagine themselves in a situation in which they have to decide if quitting their job to embark on what seems to be an attractive business venture is the right course of action to take.

We used the Subjective Expected Utility (SEU) model and determined that, for this example, getting on board a new business project was the alternative with the highest utility. In other words, based on our subjective estimation of probabilities and outcomes, we can determine the decisions that will bring us more satisfaction [i].

Thus, being able to estimate probabilities accurately is a fundamental requirement when using the SEU method. In practical terms, wrong probability estimates can lead to costly and incorrect decisions. Despite its critical importance, it’s not obvious how we can improve our ability to estimate probabilities, especially when dealing with a constant stream of new information.

Therefore, continuing our quest to reduce as much as possible the uncertainty of the decisions we face, this article further develops our decision-making tools by explaining the next key, game-changing concept: Bayes’ Theorem.

Bayes' Theorem

Life feeds us with an unstoppable stream of new events and pieces of information that allow us to update our beliefs about the world.

These facts become invaluable evidence that should be used to improve the quality of our judgments; nevertheless, using that evidence properly is more counterintuitive than you would expect. For example, suppose you are at a party and you meet someone called X, who has a flirty attitude towards you. Do you know how to determine whether that person wants to have a fling with you?

I guess you find this question intriguing because, although it deals with a rather frivolous situation, answering it with precision is not straightforward. Fortunately, we have Bayes’ Theorem, the best way to “decode” and solve this kind of challenge.

Figure 1: Bayes' Theorem

This formula lets us determine the probability of occurrence of a hypothesis given new evidence. Furthermore, it is the mathematical representation of a way of thinking that can improve our understanding of the relationship between what we know and what unfolds around us. And in practice, it can dramatically boost the quality of the decisions we make.

Bayes’ Theorem can be applied to pretty much any domain of knowledge, which makes it one of the most important tools humans have for reducing uncertainty. Among a broad variety of examples, it has been used to determine the probability of having a medical condition after a positive test result [1], to forecast the outcome of political elections [2], to improve machine-learning performance [3], and even to “prove” [4] and “disprove” [5] the existence of God.

Considering its importance, I find it very strange that Bayesian thinking hasn’t reached a mainstream audience and remains mainly relegated to the realms of mathematicians, philosophers, and statisticians. I think the principal reason for this lack of popularity is the difficulty of understanding its mechanism. Although there are plenty of sources with very extensive and sound explanations of Bayes’ Theorem, I haven’t found any that are sufficiently concrete, intuitive, and adaptable. Here, I will try to fill that gap.

Dissecting Bayes

The powerful Bayes’ Theorem is simply a “strength test” between competing hypotheses, with the goal of determining their probabilities of occurrence in light of new evidence. Using the formula (Figure 1) is much easier than it seems. To do so, let’s start by dissecting it into its various components [ii] [iii].

The probability of occurrence is represented by the letter p followed by parentheses.
H represents the hypothesis we are inquiring about.
The new evidence is represented by the letter E.
The sign “|” means given.
What we aim to find is the posterior probability p(H|E), that is to say, the probability of occurrence of the hypothesis H given the new evidence E.
p(H) or prior probability represents the probability of the hypothesis we are assessing, without taking into account the new evidence. Its value expresses what we already know about the state of the world. It might have a subjective or objective origin.
p(E|H) is the probability of occurrence of the evidence E given the hypothesis H. Put in other words, “if the hypothesis is true, how likely is the evidence”.
~H represents the competing hypothesis of H. These hypotheses are complementary, that is, ~H means not H. Therefore, their probabilities must add up to exactly 1 (or its equivalent, 100%), so p(~H) = 1 - p(H).
p(E|~H) is the probability of occurrence of the evidence E given the competing hypothesis. In other words, “if the competing hypothesis is true, how likely is the evidence”.
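
Assembled from the components listed above, the theorem for two complementary hypotheses can be written as follows, with ~H denoting “not H”. This is the standard two-hypothesis form; it is an assumption that Figure 1 uses exactly this layout.

```latex
p(H \mid E) \;=\; \frac{p(E \mid H)\, p(H)}
                       {p(E \mid H)\, p(H) \;+\; p(E \mid {\sim}H)\, p({\sim}H)},
\qquad p({\sim}H) = 1 - p(H)
```

The numerator is the “strength” of H once the evidence is taken into account, and the denominator adds the strength of the competing hypothesis, so the result is the share of the total that belongs to H.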

The Hypothesis Strength Chart

Now that we know the meaning of the elements, the remaining challenge is establishing each value for a given scenario. This is perhaps the most cumbersome part (for example, what do we mean by “if the competing hypothesis is true, how likely is the evidence”?).

The logic behind Bayes’ Theorem has been explained in different ways, for example graphically via Venn diagrams or with decision-support tools such as decision trees; however, I think it can be done in an easier way. I propose using what we will call the Hypotheses Strength Chart, a visual representation of the competing hypotheses and their relationship with the new evidence (Figure 2).

The best way to explain it is to apply it directly to the “fling” question mentioned above. The chart is shown first, followed by a step-by-step description.

Figure 2: Hypothesis Strength Table - Example 1

Our goal is to determine the probability that a person named X, who flirts with you at a party, in fact wants to have a fling. Hence, we have to consider the two competing hypotheses: Person X “wants” vs “doesn’t want” to have a fling with you.

We start by assigning a probability to the hypothesis H based on previous knowledge. A hypothesis is a tentative assumption made in order to draw out and test its logical or empirical consequences. To assign a probability to the hypothesis ask yourself: With all the information I have and based on my experience, how likely is it that someone that I meet at a party wants to have a fling with me?

If you don’t know the answer, give it a probability of 50% (one outcome of two possibilities). Because the two hypotheses are complementary, if p(H) is equal to 50%, p(~H) is also equal to 50%. Remember, always aim to assign a value to p(H) using the most objective information you have. If this kind of data is available, use it to adjust your intuitive estimates.

The fact that X flirted with you is a crucial piece of evidence in order to find out if he or she wants to have a fling with you. This is when it becomes interesting.

To find the value of p(E|H) ask yourself: If the hypothesis is true, how likely is this evidence? In our example, of all the people at the party who want to have a fling with me, how likely is it that they flirt with me? Notice that we are now in a universe that only includes people who want to have a fling with you. We are giving it a 60% probability because there are many people who are not flirtatious even when they like someone. This value is represented by the bar with the label “+”. The bar with the negative sign “-” represents all the people at the party who want to have a fling with you but are not inclined to flirt, which is estimated at 40%.

Finally, we include the effect of the alternative hypothesis in our computation by estimating the value of p(E|~H). Of all the people that don’t want to have a fling with me, how likely is it that they flirt with me? We are giving a 10% probability because there are people who flirt even when they don’t want to have a fling.
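
One way to make the chart's logic concrete is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical crowd of 1,000 party guests (that number is not part of the example, it is only there to turn percentages into head counts) and splits it using the three estimates given above.

```python
# Natural-frequency view of the "fling" example.
# ASSUMPTION: a hypothetical crowd of 1,000 guests, used only to turn the
# article's percentages into head counts.
guests = 1000

p_h = 0.50              # prior: a guest wants a fling with you
p_e_given_h = 0.60      # flirts, given that they want a fling
p_e_given_not_h = 0.10  # flirts, given that they do not want a fling

want = guests * p_h
dont_want = guests * (1 - p_h)

# Only the guests who actually flirt are compatible with the evidence.
flirt_and_want = want * p_e_given_h
flirt_and_dont_want = dont_want * p_e_given_not_h

all_flirts = flirt_and_want + flirt_and_dont_want
share = flirt_and_want / all_flirts

print(f"{flirt_and_want:.0f} of {all_flirts:.0f} flirting guests want a fling ({share:.1%})")
```

Once you know that X flirted, the guests who did not flirt drop out of the picture, and the question becomes what fraction of the remaining flirting guests belongs to the “wants a fling” group.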

Shown below is the process of its mathematical resolution:

Mathematical resolution of Example 1
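
For readers who want to follow the arithmetic step by step, plugging the three estimates from the example (a 50% prior, 60% for p(E|H), and 10% for p(E|~H)) into the standard two-hypothesis form of the theorem gives the following. The layout is a sketch and may differ from the original figure.

```latex
p(H \mid E)
  = \frac{0.60 \times 0.50}{0.60 \times 0.50 + 0.10 \times 0.50}
  = \frac{0.30}{0.30 + 0.05}
  = \frac{0.30}{0.35}
  \approx 0.857
```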

Thanks to Bayes’ Theorem, we can estimate that the probability that someone who flirts with you also wants to have a fling is about 86% (0.30 / 0.35 ≈ 0.857). Remarkably, we can expect this result to have an important effect on actual behavior. Assuming that the attraction with X is mutual, how would you behave towards him or her knowing that there are roughly 8.6 out of 10 chances that they want to have a fling with you?

As an additional comment, if you’re interested in exploring the theorem further, try out this Bayesian calculator to estimate your own posterior probabilities on the basis of your personal beliefs and experiences: http://camspiers.github.io/Bayes/. I can assure you that, for this and pretty much any case you can come up with, it becomes an addictive game.

Taking Bayes a little further

Before concluding, we will use the Bayes’ Theorem to improve our estimation of probabilities on a different kind of problem.

Let’s go back to the dilemma proposed in the article “Molding Uncertainty”, where the readers were placed in a hypothetical situation in which they had to decide between staying at their job or quitting and joining a new business venture. In this example the probability of success of the enterprise was estimated at 65% (here, our Prior).

Today, Hope — your lovely friend who wants you to join her in the new business venture — tells you that she has just closed a deal with an investment firm that will give the company a large amount of money, which will allow an early start of operations.

Suppose you go online and find a study that estimates that out of the companies that succeed in the long run, 25% of them have received strong financial funding at an early stage. Additionally, 15% of businesses that fail have received this kind of investment. How can this new evidence affect your estimation of probabilities?

Figure 3 shows the Bayesian representation of this case with a Hypotheses Strength Chart. Then its mathematical resolution is presented.

Figure 3: Hypothesis Strength Table - Example 2

Mathematical resolution of Example 2
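
The same calculation can be checked with a few lines of Python. This is a minimal sketch: the function name `bayes_posterior` and its parameter names are illustrative choices, not something defined in this series, and the inputs are the 65% prior and the 25% and 15% figures from the study described above.

```python
def bayes_posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return p(H|E) for a hypothesis H and its complement ~H.

    Hypothetical helper for illustration; assumes H and ~H are the only
    two possibilities, so p(~H) = 1 - p(H).
    """
    for name, value in (("prior", prior),
                        ("p_e_given_h", p_e_given_h),
                        ("p_e_given_not_h", p_e_given_not_h)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    numerator = p_e_given_h * prior
    denominator = numerator + p_e_given_not_h * (1.0 - prior)
    if denominator == 0.0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return numerator / denominator


# Example 2: success of the venture given strong early funding.
posterior_success = bayes_posterior(prior=0.65, p_e_given_h=0.25, p_e_given_not_h=0.15)
print(f"p(success | early funding) = {posterior_success:.1%}")
```

With these inputs the numerator is 0.25 × 0.65 = 0.1625 and the denominator is 0.1625 + 0.15 × 0.35 = 0.215, which gives roughly 0.756.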

Using this Bayesian approach, we update our belief in the hypothesis with the new evidence: the probability of success of this business venture rises from 65% to about 76% given that it receives a strong early investment.

If we plug this new estimate into the SEU model, the option of quitting your job has a utility of 46.4, while staying at your job results in a utility of -15.6. Thus, with the latest evidence, your degree of belief in the “success” hypothesis increases and your confidence in the decision to quit your job is strengthened.

Finally, bear in mind that if opposing evidence arrives, probability and utility estimates will likely change. This is Bayesian thinking, after all.
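
To see how opposing evidence would pull the estimate back down, the sketch below chains two updates, using each posterior as the prior for the next piece of evidence. The first item uses the funding figures from the example. The second item, its label, and its two likelihoods are entirely hypothetical and included only to illustrate the mechanism. Chaining updates this way also assumes the pieces of evidence are independent of each other once we know whether the venture succeeds.

```python
def update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """One Bayesian update for a hypothesis and its complement."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))


# (label, p(E | success), p(E | failure))
evidence_stream = [
    ("early funding secured", 0.25, 0.15),               # figures from the example
    ("key partner leaves (HYPOTHETICAL)", 0.10, 0.30),   # invented for illustration
]

belief = 0.65          # prior probability of success
history = [belief]     # keeps every intermediate estimate

for label, p_e_success, p_e_failure in evidence_stream:
    belief = update(belief, p_e_success, p_e_failure)
    history.append(belief)
    print(f"after '{label}': p(success) = {belief:.1%}")
```

Evidence that is more likely under success than under failure raises the estimate, and evidence that is more likely under failure lowers it, regardless of the order in which the two arrive.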

Having a true Bayesian mindset implies revisiting our judgments and decisions as new knowledge is presented, and reevaluating (and even changing) our assumptions every time we acquire new information. This requires an attitude of curiosity and open-mindedness about the latest data, and also skepticism about our prior beliefs. Personally, Bayes’ Theorem is one of the concepts I wish I had learned when I was much younger, perhaps when I was taught algebra in middle school. Luckily, I can assure you, it’s never too late.

Main Takeaway

This article explained Bayes’ Theorem, a way to improve the often dubious quality of our probability estimates under uncertainty. That challenge is elusive because it is rarely obvious what to do when new evidence arrives. Bayes’ Theorem is a paramount tool for updating our degree of belief in a hypothesis based on the occurrence of another event, potentially boosting the quality of the decisions we make. The Hypotheses Strength Chart is suggested as a way to visualize its logic.

Notes

[i] The full explanation of how to use probability estimates to make decisions with the Subjective Expected Utility (SEU) method is developed in “Molding Uncertainty”. Although the present article contains all the concepts necessary to understand Bayes’ Theorem, I highly recommend reading the first part of the series so that the role of Bayesian thinking as a decision-making enhancer is clearer.

[ii] The derivation of the formula is beyond the scope of this article. An extensive explanation can be found here: https://plato.stanford.edu/entries/bayes-theorem/

[iii] This section is inspired by the explanation Richard Carrier gave at Skepticon 4 https://www.youtube.com/watch?v=HHIz-gR4xHo&list=FLg7_6eg-0eiz312np0-xbuQ