
## What is Bayes’ Theorem?

Bayes’ theorem is a way to figure out conditional probability. Conditional probability is the probability of an event happening, given that it has some relationship to one or more other events. For example, your probability of getting a parking space is connected to the time of day you park, where you park, and what conventions are going on at any time. Bayes’ theorem is slightly more nuanced. In a nutshell, it gives you the actual probability of an **event** given information about **tests**.

- “Events” are different from “tests.” For example, there is a **test** for liver disease, but that’s separate from the **event** of actually having liver disease.
- **Tests are flawed**: just because you have a positive test does not mean you actually have the disease. Many tests have a high false positive rate.
- **For rare events, a larger share of positive results are false positives** than for more common events.

We’re not just talking about medical tests here. For example, spam filtering can have high false positive rates. Bayes’ theorem takes the test results and calculates your *real probability* that the test has identified the event.

## The Formula


Bayes’ Theorem (also known as Bayes’ rule) is a deceptively simple formula used to calculate conditional probability. The theorem is named after English mathematician Thomas Bayes (1701-1761). The formal definition for the rule is:

P(A|B) = P(B|A) * P(A) / P(B)

In most cases, you can’t just plug numbers into an equation; you have to figure out what your “tests” and “events” are first. For two events, A and B, Bayes’ theorem allows you to figure out P(A|B) (the probability that event A happened, given that test B was positive) from P(B|A) (the probability that test B was positive, given that event A happened). It can be a little tricky to wrap your head around as technically you’re working backwards; you may have to switch your tests and events around, which can get confusing. An example should clarify what I mean by “switch the tests and events around.”

## Bayes’ Theorem Example #1

You might be interested in finding out a patient’s probability of having liver disease if they are an alcoholic. “Being an alcoholic” is the **test** (kind of like a litmus test) for liver disease.

- **A** could mean the event “Patient has liver disease.” Past data tells you that 10% of patients entering your clinic have liver disease. P(A) = 0.10.
- **B** could mean the litmus test that “Patient is an alcoholic.” Five percent of the clinic’s patients are alcoholics. P(B) = 0.05.
- You might also know that among those patients diagnosed with liver disease, 7% are alcoholics. This is your **B|A**: the probability that a patient is alcoholic, given that they have liver disease, is 7%.

Bayes’ theorem tells you:

**P(A|B) = (0.07 * 0.1)/0.05 = 0.14**

In other words, if the patient is an alcoholic, their chance of having liver disease is 0.14 (14%). This is a large increase from the 10% suggested by past data. But it’s still unlikely that any particular patient has liver disease.
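The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `bayes` and the variable names are made up for illustration, and the numbers are the ones given in the example.

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) computed as P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b


# Numbers from the liver disease example.
p_liver = 0.10                   # P(A): patient has liver disease
p_alcoholic = 0.05               # P(B): patient is an alcoholic
p_alcoholic_given_liver = 0.07   # P(B|A): alcoholic, given liver disease

p_liver_given_alcoholic = bayes(p_alcoholic_given_liver, p_liver, p_alcoholic)
print(round(p_liver_given_alcoholic, 2))  # 0.14
```

Keeping the three inputs as named variables makes it harder to mix up which number is P(A), which is P(B), and which is P(B|A).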

## More Bayes’ Theorem Examples

## Bayes’ Theorem Problems Example #2

Another way to look at the theorem is to say that one event follows another. Above I said “tests” and “events”, but it’s also legitimate to think of it as the “first event” that leads to the “second event.” There’s no one right way to do this: use the terminology that makes most sense to you.

In a particular pain clinic, 10% of patients are prescribed narcotic pain killers. Overall, five percent of the clinic’s patients are addicted to narcotics (including pain killers and illegal substances). Out of all the people prescribed pain pills, 8% are addicts. *If a patient is an addict, what is the probability that they will be prescribed pain pills?*

Step 1: **Figure out what your event “A” is from the question.** That information is in the italicized part of this particular question. The event that happens first (A) is being prescribed pain pills. That’s given as 10%.

Step 2: **Figure out what your event “B” is from the question.** That information is also in the italicized part of this particular question. Event B is being an addict. That’s given as 5%.

Step 3: **Figure out the probability of event B (Step 2) given event A (Step 1)**. In other words, find what P(B|A) is. We want to know “Given that people are prescribed pain pills, what’s the probability they are an addict?” That is given in the question as 8%, or 0.08.

Step 4: **Insert your answers from Steps 1, 2 and 3 into the formula and solve.**

P(A|B) = P(B|A) * P(A) / P(B) = (0.08 * 0.1)/0.05 = 0.16

The probability of an addict being prescribed pain pills is 0.16 (16%).
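If the formula feels abstract, counting patients gives the same answer. The sketch below assumes a hypothetical clinic of 1,000 patients (the clinic size is an assumption for illustration; any size works) and uses the percentages from the question.

```python
patients = 1000  # hypothetical clinic size, chosen only to get whole numbers

prescribed = round(0.10 * patients)                # 100 patients prescribed pain pills
addicts = round(0.05 * patients)                   # 50 patients are addicts
prescribed_and_addict = round(0.08 * prescribed)   # 8 prescribed patients are addicts

# Of the 50 addicts, 8 were prescribed pain pills.
p_prescribed_given_addict = prescribed_and_addict / addicts
print(p_prescribed_given_addict)  # 0.16

# Same result from the formula P(A|B) = P(B|A) * P(A) / P(B).
p_from_formula = 0.08 * 0.10 / 0.05
print(round(p_from_formula, 2))  # 0.16
```

The count version shows where the 16% comes from: the 8 patients who are both prescribed and addicted, divided by all 50 addicts.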

## Example #3: the Medical Test

A slightly more complicated example involves a medical test (in this case, a genetic test):

There are **several forms of Bayes’ Theorem** out there, and they are all equivalent (they are just written in slightly different ways). In this next equation, “X” is used in place of “B.” In addition, you’ll see some changes in the denominator. The proof of why we can rearrange the equation like this is beyond the scope of this article (otherwise it would be 5,000 words instead of 2,000!). However, if you come across a question involving medical tests, you’ll likely be using this alternative formula to find the answer:

P(A|X) = P(X|A) * P(A) / (P(X|A) * P(A) + P(X|~A) * P(~A))


1% of people have a certain genetic defect.

90% of tests for the gene detect the defect (true positives).

9.6% of the tests are false positives.

If a person gets a positive test result, **what is the probability that they actually have the genetic defect?**

The first step in solving Bayes’ theorem problems is to assign letters to events:

- A = chance of having the faulty gene. That was given in the question as 1%. That also means the probability of *not* having the gene (~A) is 99%.
- X = A positive test result.

So:

- P(A|X) = Probability of having the gene given a positive test result.
- P(X|A) = Chance of a positive test result given that the person actually has the gene. That was given in the question as 90%.
- P(X|~A) = Chance of a positive test if the person *doesn’t* have the gene. That was given in the question as 9.6%.

Now we have all of the information we need to put into the equation:

P(A|X) = (.9 * .01) / (.9 * .01 + .096 * .99) = 0.0865 (8.65%).

The probability of having the faulty gene, given a positive test result, is 8.65%.
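The medical-test form of the equation translates directly into a small function. This is a sketch, and the function and parameter names (`posterior`, `true_positive_rate`, `false_positive_rate`) are illustrative choices, not standard names from any library.

```python
def posterior(prior: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """Return P(A|X) = P(X|A)P(A) / (P(X|A)P(A) + P(X|~A)P(~A))."""
    true_positives = true_positive_rate * prior            # P(X|A) * P(A)
    false_positives = false_positive_rate * (1 - prior)    # P(X|~A) * P(~A)
    return true_positives / (true_positives + false_positives)


# Genetic test example: P(A) = 0.01, P(X|A) = 0.90, P(X|~A) = 0.096
p_gene_given_positive = posterior(0.01, 0.90, 0.096)
print(f"{p_gene_given_positive:.4f}")  # 0.0865
```

Note that the function takes P(A) only; P(~A) is computed as 1 - P(A), which mirrors the 1% and 99% split in the worked solution.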

## Bayes’ Theorem Problems #4: A Test for Cancer

I wrote about how challenging physicians find probability and statistics in my post on reading mammogram results wrong. It’s not surprising that physicians are way off with their interpretation of results, given that some tricky probabilities are at play. Here’s a second example of how Bayes’ Theorem works. I’ve used similar numbers, but the question is worded differently to give you another opportunity to wrap your mind around how you decide which is event A and which is event X.

**Q. Given the following statistics, what is the probability that a woman has cancer if she has a positive mammogram result?**

- One percent of women over 50 have breast cancer.
- Ninety percent of women who have breast cancer test positive on mammograms.
- Eight percent of women will have false positives.

Step 1: Assign events to A or X. You want to know what a woman’s probability of having cancer is, given a positive mammogram. For this problem, actually having cancer is A and a positive test result is X.

Step 2: List out the parts of the equation (this makes it easier to work the actual equation):

P(A)=0.01

P(~A)=0.99

P(X|A)=0.9

P(X|~A)=0.08

Step 3: Insert the parts into the equation and solve. Note that as this is a medical test, we’re using the form of the equation from example #3:

P(A|X) = (0.9 * 0.01) / ((0.9 * 0.01) + (0.08 * 0.99)) = 0.10.

The probability of a woman having cancer, given a positive test result, is 10%.
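To see why a positive result on a fairly accurate test can still mean only a 10% chance of cancer, it helps to vary how common the condition is while keeping the test the same. The sketch below reuses the mammogram rates from this example; the list of alternative prevalence values is made up purely for illustration.

```python
def posterior(prior: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """Return the probability of the condition given a positive test."""
    true_positives = true_positive_rate * prior
    false_positives = false_positive_rate * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Mammogram example rates; only the prevalence P(A) changes between rows.
TRUE_POSITIVE_RATE = 0.90
FALSE_POSITIVE_RATE = 0.08

results = {}
for prior in (0.001, 0.01, 0.10, 0.50):  # illustrative prevalence values
    results[prior] = posterior(prior, TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE)
    print(f"P(A) = {prior:<5}  P(A|X) = {results[prior]:.3f}")
```

The rarer the condition, the more the false positives from the large healthy group outnumber the true positives from the small affected group, so the probability after a positive test stays low.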

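**Remember when I said earlier that there are many equivalent ways to write Bayes’ Theorem?** Here is another equation that you can use to figure out the above problem. You’ll get exactly the same result:

P(B|A) = P(B ∩ A) / (P(B ∩ A) + P(B^{c} ∩ A))

Note that the letters are used differently in this form: B is the event (having cancer) and A is the positive test result.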

The main difference with this form of the equation is that it uses the probability terms *intersection* (∩) and *complement* (^{c}). Think of it as shorthand: it’s the same equation, written in a different way.

In order to find the probabilities on the right side of this equation, use the multiplication rule:

P(B ∩ A) = P(B) * P(A|B)

The two sides of the equation are equivalent, and P(B) * P(A|B) is what we were using when we solved the numerator in the problem above.

P(B) * P(A|B) = 0.01 * 0.9 = 0.009

For the denominator, we have P(B^{c} ∩ A) as part of the equation. This can be (equivalently) rewritten as P(B^{c}) * P(A|B^{c}). This gives us:

P(B^{c}) * P(A|B^{c}) = 0.99 * 0.08 = 0.0792.

Inserting those two solutions into the formula, we get:

0.009 / (0.009 + 0.0792) = 10%.
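The intersection form can be worked the same way in code by computing the two joint probabilities first. This is a minimal sketch using the mammogram numbers; the variable names are illustrative, with B standing for having cancer and A for a positive test, as in the calculation above.

```python
p_b = 0.01              # P(B): has cancer
p_a_given_b = 0.90      # P(A|B): positive test, given cancer
p_a_given_not_b = 0.08  # P(A|B^c): positive test, given no cancer

# Multiplication rule: P(B ∩ A) = P(B) * P(A|B)
p_b_and_a = p_b * p_a_given_b                # 0.009
p_not_b_and_a = (1 - p_b) * p_a_given_not_b  # 0.0792

# P(B|A) = P(B ∩ A) / (P(B ∩ A) + P(B^c ∩ A))
p_b_given_a = p_b_and_a / (p_b_and_a + p_not_b_and_a)
print(f"{p_b_given_a:.2f}")  # 0.10
```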

## Bayes’ Theorem Problems: Another Way to Look at It

Bayes’ theorem problems can be figured out *without* using the equation (although using the equation is probably simpler). But if you can’t wrap your head around why the equation works (or what it’s doing), here’s the non-equation solution for the same problem in #3 (the genetic test problem) above.

Step 1: Find the probability of a true positive on the test. That equals people who actually have the defect (1%) * true positive results (90%) = .009.

Step 2: Find the probability of a false positive on the test. That equals people who don’t have the defect (99%) * false positive results (9.6%) = .09504.

Step 3: Figure out the probability of getting a positive result on the test. That equals the chance of a true positive (Step 1) plus a false positive (Step 2) = .009 + .09504 = 0.10404.

Step 4: Find the probability of actually having the gene, given a positive result. Divide the chance of having a real, positive result (Step 1) by the chance of getting any kind of positive result (Step 3) = .009/.10404 = 0.0865 (8.65%).
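Another way to convince yourself without the equation is to simulate a large group of people and simply count. The sketch below is an assumption-laden illustration, not part of the original solution: it draws 200,000 hypothetical people with a fixed random seed, gives each one the defect with probability 1%, and applies the test rates from the problem. The estimate should land close to 8.65%, but will not match it exactly.

```python
import random


def simulate(n_people: int, prior: float, true_positive_rate: float,
             false_positive_rate: float, seed: int = 0) -> float:
    """Estimate P(defect | positive test) by counting simulated people."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    positives = 0
    true_positives = 0
    for _ in range(n_people):
        has_defect = rng.random() < prior
        # The chance of a positive result depends on whether the defect is present.
        positive_rate = true_positive_rate if has_defect else false_positive_rate
        if rng.random() < positive_rate:
            positives += 1
            if has_defect:
                true_positives += 1
    if positives == 0:
        raise ValueError("no positive tests in the simulation; increase n_people")
    return true_positives / positives


estimate = simulate(200_000, prior=0.01, true_positive_rate=0.90,
                    false_positive_rate=0.096)
print(f"Estimated P(defect | positive) = {estimate:.3f}")
```

The division at the end is the same as Step 4: people with a real positive result, divided by everyone with any kind of positive result.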

## Other forms of Bayes’ Theorem

Bayes’ Theorem has several forms. You probably won’t encounter any of these other forms in an elementary stats class. The different forms can be used for different purposes. For example, one version uses what Rudolf Carnap called the “**probability ratio**”. The probability ratio rule states that the unconditional probability of any event H (like a patient having liver disease) must be multiplied by the factor PR(H,E) = P_{E}(H)/P(H). That gives the event’s probability conditional on E. The **Odds Ratio Rule** is very similar to the probability ratio rule, but it uses the likelihood ratio, which divides a test’s true positive rate by its false positive rate. The formal definition of the Odds Ratio rule is OR(H,E) = P_{H}(E)/P_{~H}(E).
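As a rough illustration of these ratio forms, the sketch below applies them to the genetic test numbers from example #3. It assumes the standard odds form of Bayes’ theorem (posterior odds = prior odds × likelihood ratio); the variable names are illustrative.

```python
prior = 0.01                 # P(H): has the defect
true_positive_rate = 0.90    # P_H(E): positive test, given the defect
false_positive_rate = 0.096  # P_~H(E): positive test, given no defect

# Likelihood ratio: true positive rate divided by false positive rate.
likelihood_ratio = true_positive_rate / false_positive_rate  # 9.375

# Odds form: convert the prior to odds, multiply, convert back.
prior_odds = prior / (1 - prior)
posterior_odds = prior_odds * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)

# Probability ratio: the factor that turns P(H) into P_E(H).
probability_ratio = posterior / prior

print(f"posterior = {posterior:.4f}")                  # posterior = 0.0865
print(f"probability ratio = {probability_ratio:.2f}")  # probability ratio = 8.65
```

Both routes give the same 8.65% as the earlier calculation; they just package the arithmetic differently.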

## Bayesian Spam Filtering

Although Bayes’ Theorem is used extensively in the medical sciences, there are other applications. For example, it’s used to filter spam. The **event** in this case is that the message is spam. The **test** for spam is that the message contains some flagged words (like “viagra” or “you have won”). Here’s the equation set up (from Wikipedia), read as “The probability a message is spam given that it contains certain flagged words”:

P(spam|words) = P(words|spam) * P(spam) / P(words)

The actual equations used for spam filtering are a little more complex; they contain more flags than just content. For example, the timing of the message, or how often the filter has seen the same content before, are two other spam tests.
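A toy, single-flag version of the calculation might look like the sketch below. Every number in it is invented for illustration (the share of spam, and how often a flagged phrase appears in spam and in legitimate mail), and real filters combine many such signals rather than one.

```python
# Hypothetical rates, made up for this example only.
p_spam = 0.40             # P(spam): share of all messages that are spam
p_flag_given_spam = 0.25  # P(words|spam): spam messages containing the flagged phrase
p_flag_given_ham = 0.01   # P(words|not spam): legitimate messages containing it

# P(words), by splitting messages into spam and not spam.
p_flag = p_flag_given_spam * p_spam + p_flag_given_ham * (1 - p_spam)

# Bayes' theorem: P(spam|words) = P(words|spam) * P(spam) / P(words)
p_spam_given_flag = p_flag_given_spam * p_spam / p_flag
print(f"{p_spam_given_flag:.3f}")  # 0.943
```

With these made-up rates, a message containing the flagged phrase is very likely spam, but about 6% of flagged messages would still be legitimate mail, which is the false positive problem described at the start of the article.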


Howdy. Your solution for question 1 is actually slightly incorrect. The prompt says that “90% of tests for the gene detect the defect”; this is the total positive tests, which includes both false positives and true positives. Therefore the test sensitivity (true positive) is 90% – 9.6% (false positives) = 80.4%.

This reduces the final likelihood of (Disease|PosTest) to only 7.8%

Hello, BoGuy,

Thanks for taking the time to comment. Upon re-reading the question, “90% of tests for the gene detect the defect” reads to me as “90% of tests for the gene ACTUALLY detect the defect”. It’s not inclusive of false positives. I may reword this so it’s clearer.

Regards,

S

Pat has asked a professor for a recommendation for graduate school. He estimates that the probability that the letter will be strong is .5, the probability that the letter will be weak is .2, and mediocre is .3. He also estimates that if the letter is strong, the probability that he will get the job is .8; if it is weak, .1; and if it is mediocre, then .4. Given that he did get the job, what is the probability that the letter was strong?

In example 1, you state P(B) = 0.05 but when you plug it into the equation, you typed 0.5 instead of 0.05. P(A|B) = (0.07 * 0.1)/0.5 = 0.14

I did type that! Thanks for spotting it, I made the correction :)

Hi Stephanie,

First of all, thanks for the great site. I have also purchased your book via iTunes bookstore.

Just a question with regarding Example #2.

The question asks, what is the probability of being an addict if you are prescribed pain killers.

And this is what I have following your explanation.

P(addict | pain killers) = p(pain killers | addict) * p(addict)/ p(pain killers)

and plugging in the numbers… i get…

0.08 x 0.05 / 0.1 = 0.04

If I have missed something please let me know.

Thanks.

Vincent

Hi Andale,

Thanks for the prompt reply. However, reading the question I still get the same answer. I have done a 2×2 table based on the percentages and I have the following…

N(+|-) = narcotic status

A(+|-) = addict status

Based on a clinic of 100 patients.

p(N+) = 0.1 (10 in the marginal total row for N+)

p(A+) = 0.05 (5 in the marginal total column total for A+)

p(N+ | A+) = 0.08 (0.4 in the 2×2 table)

| | A+ | A- | Total |
|---|---|---|---|
| N+ | 0.4 | 9.6 | 10 |
| N- | 4.6 | 85.4 | 90 |
| Total | 5 | 95 | 100 |

p(A+ | N+) = 0.4 / 10 = 0.04

Thanks.

The question that you are asking here:

p(A+ | N+)

is “If a person is an addict, what is the probability they get addicted to narcotics”.

However, note the wording for the problem in the article: it’s completely different:

p(N+ | A+)

“If a patient is prescribed pain pills, what is the probability that they will become an addict?”

So the problem here isn’t the calculations, it’s that you seem to be asking a very different question.

Hi Andale,

Yes. I see the problem now. I had it back to front.

Thanks for the clarification.

Hi Stephanie,

Thank you also for the great site.

I’m picking up on Vincent’s query about Example #2. I have gone over the example several times and am convinced that A and B are switched around in your example.

Your last reply to Vincent said:

p(A+ | N+) is “If a person is an addict, what is the probability they get addicted to narcotics”.

Isn’t that the opposite to the definition of P(A|B) = probability of observing event A given that B is true?

The answer also seems intuitively wrong to me. 10% of patients are prescribed pain killers. But out of the addicts, only 8% are prescribed pain killers. So intuitively I would expect that if you are prescribed pain killers then you would have a less-than-average likelihood of being an addict. But the answer of 16% is way above the unconditional probability of being an addict 5% – it just doesn’t seem right.

Sorry if I’m being dense :-)

Cheers,

James

“Intuitively” usually doesn’t work with Bayes. Your statement “if you are prescribed pain killers then you would have a less-than-average likelihood of being an addict” doesn’t make intuitive sense to me ;) After all, I have NO idea about the true “addictiveness” of pain pills or how many “normal” people get addicted. I’m just using these numbers as examples.

As for “p(A+ | N+) is ‘If a person is an addict, what is the probability they get addicted to narcotics’… Isn’t that the opposite to the definition of P(A|B) = probability of observing event A given that B is true?” Kind of? But not really. The “opposite” to that particular statement would actually be “what is the probability a person gets addicted to narcotics if they are an addict?” I’d say that’s 100%, but it is not what the question is asking here.

Thanks for the great site. After working on Example #2 I have to agree with James. I think the confusion with Example #2 stems from the wording of the prompt. Prompt: “If a patient is prescribed pain pills, what is the probability that they will become an addict?” Isn’t this the same as: What is the probability a patient will become an addict given they are prescribed pain pills? Regardless of wording, I think event (A) is becoming an addict and event (B) (i.e. the event that happens first) is being prescribed pain pills. However, in your solution you switched A and B and solved for P(prescribed pain pills | addict).

I’m having some trouble proving Bayes’ theorem.

I’m with James – are you really sure Example #2 is correct?

5% are addicts P(A)=5%

10% are prescribed narcotics P(N)=10%

8% of addicts have been prescribed narcotics P(N|A)=8% (ie if you are an addict there is an 8% chance you have been prescribed narcotics)

So the probability that someone prescribed narcotics will be(come) an addict is P(A|N) = P(A) x P(N|A) / P(N) = 5% x 8% / 10% = 4% (not 16%)

This is analogous to Example #1

10% have liver disease P(L)=10%

5% are alcoholics P(A)=5%

7% of those with liver disease are alcoholics P(A|L)=7% (ie if you have liver disease there is a 7% chance you are an alcoholic)

so the probability that someone who is an alcoholic will have liver disease is P(L|A) = P(L) x P(A|L) / P(A) = 10% x 7% / 5% = 14%

Something must be wrong, but I can’t see it…

You put that:

P(A|N) = P(A) x P(N|A) / P(N) = 5% x 8% / 10% = 4% (not 16%)

But the event that happens first (event A) is being prescribed narcotics (P(N)). You put P(A) — being an addict. The probability of being an addict (P(B) in my notation) is the second event, and that goes in the denominator.

Samuel,

I don’t cover proofs on the site, but you can find a fairly good one here.

Thanks for the compliment on the site :)

I’ve made some improvements to the question. I hope that clears things up.

Really sorry, I’m still missing something. P(B) goes in the denominator if you want to calculate P(A|B), which is the probability of the first event A given the second event B (ie the probability of A, having been prescribed narcotics, if B, you are an addict). But this is the 8% we are already given in the question. What we actually want to know is the probability of becoming an addict if you are prescribed narcotics, P(B|A), so we need to rearrange the formula and P(A) goes in the denominator. Sorry for being so dense, but I can’t figure out why this is wrong.

I do agree with Andy and James.

You have mentioned, “Among the addicts in the clinic, 8% have been prescribed narcotics.” It means 8% of the addicts are prescribed narcotics. So, how can you take P(B/A), which is “if the event, narcotics are prescribed is true, probability that the patient is an addict is true”, as 8%? P(B/A) is what we should find.

And P(A/B), which is “if the event, narcotics are prescribed is true, the probability the patient is an addict is true”, is what needs to be found.

This site was very helpful

Please add more examples for Bayes’ theorem.

thanks for this explanation :*

5% of people have high blood pressure. Of the people with high blood pressure, 75% drink alcohol, whereas only 50% of the people without high blood pressure drink alcohol. What percent of the drinkers have high blood pressure?

How far do you get with your problem before you get stuck?

Dear IBSA,

Greetings !!, Thank you for the post number 25, involving the problem on alcoholics and high blood pressure.

The solution would be:

P(HBP) = 0.05

naturally, P(no HBP) = 1-0.05 = 0.95

P(people having HBP, given, they drink alcohol)

P(HBP/alc) = 0.75

P(people having no HBP, given, they drink alcohol)

P(no HBP/alc) = 0.5

We need to find

P(people who drink alcohol, given, they have HBP)

in other words, find P(alc/HBP) = ??

Lets recall the Bayes’ rule:

P(A/B) = [P(B/A)*P(A)] / [{P(B/A)*P(A)} + {P(B/A’)*P(A’)}]

Now in this problem,

A = alc

B = HBP

Substituting for A and B in the above equation, we arrive at

P(alc/HBP) = [P(HBP/alc)*P(alc)] / [{P(HBP/alc)*P(alc)} + {P(HBP/no alc)*P(no alc)}]

Now if we carefully examine the problem statement, we find that p(alc) and p(no alc) values are missing.

So the conclusion would be the problem remains unsolved due to missing data, and this is probably the reply to post number 26, that we get stuck at this point of solving the problem.

Hope this helps.