
## What is a Marginal distribution?


The technical definition can be a little mind-numbing to look at:

**Definition of a marginal distribution:** If $X$ and $Y$ are discrete random variables and $f(x, y)$ is the value of their joint probability distribution at $(x, y)$, the functions given by

$$g(x) = \sum_{y} f(x, y) \quad \text{and} \quad h(y) = \sum_{x} f(x, y)$$

are the marginal distributions of $X$ and $Y$, respectively.

If you’re great with equations, that’s probably all you need to know. It tells you how to find a marginal distribution. But if that formula gives you a headache (which it does to most people!), you can use a frequency distribution table to find a marginal distribution.

A **marginal distribution** gets its name because it appears in the *margins* of a probability distribution table.

Of course, it’s not *quite* as simple as that. You can’t just look at any old frequency distribution table and say that the last column (or row) is a “marginal distribution.” Marginal distributions follow a couple of rules:

- The distribution must be from bivariate data. Bivariate is just another way of saying “two variables,” like X and Y. In the table above, the random variables i and j come from the roll of two dice.
- A marginal distribution is where you are only interested in *one* of the random variables. In other words, either X **or** Y. If you look at the probability table above, the summed probabilities of one variable are listed in the bottom row and the summed probabilities of the other are listed in the right column. So this table has *two* marginal distributions.
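To make the "margins" idea concrete, here is a minimal sketch that applies the summation formula from the definition to two dice. It assumes both dice are fair and that X is the first die and Y is the second; the names `joint`, `g`, and `h` are illustrative choices, not taken from the original table.

```python
from fractions import Fraction
from itertools import product

# Assumed joint distribution f(x, y) for two fair six-sided dice:
# each of the 36 ordered pairs has probability 1/36.
faces = range(1, 7)
joint = {(x, y): Fraction(1, 36) for x, y in product(faces, faces)}

# Marginal of X: for each x, sum f(x, y) over every y (the row totals).
g = {x: sum(joint[(x, y)] for y in faces) for x in faces}

# Marginal of Y: for each y, sum f(x, y) over every x (the column totals).
h = {y: sum(joint[(x, y)] for x in faces) for y in faces}

print(g[1])             # marginal probability that the first die shows 1
print(sum(g.values()))  # a marginal distribution sums to 1
```

Using `Fraction` keeps the arithmetic exact, so the check that each marginal sums to 1 is not blurred by floating-point rounding.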

## Difference Between Marginal Distribution and Conditional Distribution

A conditional distribution is where we are only interested in a particular sub-population of our entire data set. In the dice rolling example, this could be “rolling a two” or “rolling a six.” The image below shows two highlighted sub-populations (and therefore, two conditional distributions).
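As a rough illustration of the contrast, the sketch below assumes two fair dice and tracks the first die together with the total of both dice. That pairing is an assumption chosen for illustration, not the table from the original example, and the helper names `marginal_total` and `conditional_total` are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Joint distribution of (first die, total of both dice) for two fair dice.
joint = {}
for first, second in product(faces, faces):
    key = (first, first + second)
    joint[key] = joint.get(key, 0) + Fraction(1, 36)

def marginal_total(total):
    """P(total): sum the joint probabilities over every value of the first die."""
    return sum(p for (_, t), p in joint.items() if t == total)

def conditional_total(total, first):
    """P(total | first die): restrict to one sub-population, then renormalize."""
    p_first = sum(p for (f, _), p in joint.items() if f == first)
    return joint.get((first, total), Fraction(0)) / p_first

print(marginal_total(8))        # over all rolls
print(conditional_total(8, 2))  # only among rolls where the first die is 2
```

The marginal looks at the whole table, while the conditional keeps only the row for "first die is 2" and rescales it so that row sums to 1.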

## How to Calculate Marginal Distribution Probability

**Sample question:** Calculate the marginal distribution of pet preference among men and women:

**Solution:**

Step 1: Count the total number of people. In this case the total is given in the right hand column (22 people).

Step 2: Count the number of people who prefer each pet type and then turn the ratio into a probability:

People who prefer cats: 7/22 = .32

People who prefer fish: 7/22 = .32

People who prefer dogs: 8/22 = .36

**Tip:** You can check your answer by making sure the probabilities all add up to 1.
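The two steps above, plus the check from the tip, can be sketched in a few lines. The counts are the pet totals used in the solution (7, 7, and 8 out of 22); the variable names are illustrative.

```python
# Pet-preference totals from the worked solution above.
counts = {"cats": 7, "fish": 7, "dogs": 8}

# Step 1: total number of people.
total = sum(counts.values())

# Step 2: turn each count into a probability.
marginal = {pet: n / total for pet, n in counts.items()}

for pet, p in marginal.items():
    print(f"{pet}: {p:.2f}")

# Tip: the probabilities should add up to 1 (allowing for float rounding).
print(abs(sum(marginal.values()) - 1) < 1e-9)
```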

**Sample question 2 (Mutually Exclusive Events)**: If P(A) = 0.20, P(B) = 0.70, and both events are mutually exclusive, find P(B’∩A), P(B’∩A’) and P(B∩A’).

If you’re unfamiliar with this notation, A’ means “not A,” the complement of A, so P(A’) is the probability that A does not happen. P(B’∩A) is the probability of “the intersection of not B and A.”

**Answer**:

You *could* figure out the probabilities individually, but they’re much easier to figure out using a table.

Step 1: Fill in a frequency table with the given information. The total probability must equal 1, so you can add that to the margins (totals) as well. Simple addition/algebra fills in the marginal blanks. For example, on the bottom row 0.70 + x = 1.00, so the marginal total for B’ must be 0.30.

Step 2: Add 0 for the intersection of A and B, at the top left of the table. You can do that because A and B are mutually exclusive and cannot happen together.

Step 3: Fill in the rest of the blanks using simple addition/algebra.

Reading from the table (look at the intersections of the two stated probabilities):

P(B’∩A) = 0.20

P(B’∩A’) = 0.10

P(B∩A’) = 0.70.
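The table-filling steps can be sketched as a small function. This is a minimal illustration, assuming the layout described above (A and A’ as rows, B and B’ as columns); the name `two_way_table` is hypothetical.

```python
def two_way_table(p_a, p_b, p_a_and_b):
    """Fill in a 2x2 probability table from the two marginals and one cell."""
    cells = {
        "A and B": p_a_and_b,
        "A and B'": p_a - p_a_and_b,             # row A must sum to P(A)
        "A' and B": p_b - p_a_and_b,             # column B must sum to P(B)
        "A' and B'": 1 - p_a - p_b + p_a_and_b,  # whatever probability is left
    }
    # Round to hide tiny floating-point residue from the subtractions.
    return {name: round(p, 10) for name, p in cells.items()}

# Mutually exclusive events: the "A and B" cell is 0.
table = two_way_table(p_a=0.20, p_b=0.70, p_a_and_b=0.0)
for name, p in table.items():
    print(f"P({name}) = {p:.2f}")
```

Only the top-left cell changes from problem to problem; once it is known, the marginal totals determine the other three cells.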

**Sample question 3 (Independent Events)**: If P(A) = 0.20, P(B) = 0.70, and both events are independent, find P(B’∩A), P(B’∩A’) and P(B∩A’).

**Answer**: This time, A and B are independent, so the probability of them both happening at the same time is 0.14 (P(A) × P(B) = 0.20 × 0.70 = 0.14). This value goes into the top left (intersection of A and B). Fill out the rest of the table exactly the same way as in the steps above.

Read the answers from the table (from the intersections of the two probabilities):

P(B’∩A): 0.06

P(B’∩A’): 0.24

P(B∩A’): 0.56.
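For readers who want to see the arithmetic behind those table entries, the values can be worked out from the marginal totals as follows (a worked check, using the same numbers as the question):

$$
\begin{aligned}
P(A \cap B) &= P(A)\,P(B) = 0.20 \times 0.70 = 0.14 \\
P(B' \cap A) &= P(A) - P(A \cap B) = 0.20 - 0.14 = 0.06 \\
P(B \cap A') &= P(B) - P(A \cap B) = 0.70 - 0.14 = 0.56 \\
P(B' \cap A') &= 1 - 0.14 - 0.06 - 0.56 = 0.24
\end{aligned}
$$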

**Sample question 4 (Conditional Probability)**: Given that P(A) = 0.20, P(B) = 0.60, and P(B|A) = 0.50, what are P(B∩A’) and P(A’∩B’)?

The key to filling out this table is to find the intersection of B and A in the top left (B∩A). The question states that P(B|A) = 50%. In words, 50% of A is in B. Event A has probability 0.20, so P(B∩A) = P(B|A) × P(A) = 0.50 × 0.20 = 0.10.
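A short sketch of that key step and the cells that follow from it, using the numbers given in the question. The variable names are illustrative, and rounding is only there to hide floating-point residue.

```python
p_a, p_b, p_b_given_a = 0.20, 0.60, 0.50

# Multiplication rule: P(A and B) = P(B|A) * P(A).
p_a_and_b = p_b_given_a * p_a

# The remaining cells follow from the row and column totals.
p_a_and_not_b = p_a - p_a_and_b      # rest of row A
p_not_a_and_b = p_b - p_a_and_b      # rest of column B
p_not_a_and_not_b = 1 - (p_a_and_b + p_a_and_not_b + p_not_a_and_b)

print(round(p_a_and_b, 2))
print(round(p_a_and_not_b, 2))
print(round(p_not_a_and_b, 2))
print(round(p_not_a_and_not_b, 2))
```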

Reading from the table:

P(B∩A’) = 0.50.

P(A’∩B’) = 0.30.
