Statistics How To

Bivariate Analysis Definition & Example

What is Bivariate Data?

Data in statistics is sometimes classified according to how many variables are in a particular study. For example, “height” might be one variable and “weight” might be another variable. Depending on the number of variables being looked at, the data might be univariate, or it might be bivariate.

When you conduct a study that looks at a single variable, that study involves univariate data. For example, you might study a group of college students to find out their average SAT scores or you might study a group of diabetic patients to find their weights. Bivariate data is when you are studying two variables. For example, if you are studying a group of college students to find out their average SAT score and their age, you have two pieces of the puzzle to find (SAT score and age). Or if you want to find out the weights and heights of college students, then you also have bivariate data. Bivariate data could also be two sets of items that are dependent on each other. For example:

  • Ice cream sales compared to the temperature that day.
  • Traffic accidents along with the weather on a particular day.

Bivariate data has many practical uses in real life. For example, it is useful to be able to predict when a natural event might occur, and one tool in the statistician’s toolbox for doing so is bivariate data analysis. Sometimes, something as simple as plotting one variable against another on a Cartesian plane can give you a clear picture of what the data is trying to tell you. For example, the scatterplot below shows the relationship between the waiting time between eruptions at Old Faithful and the duration of the eruption.

Waiting time between eruptions and the duration of the eruption for the Old Faithful Geyser in Yellowstone National Park, Wyoming, USA. This scatterplot suggests there are generally two “types” of eruptions: short-wait-short-duration, and long-wait-long-duration.

What is Bivariate Analysis?

Bivariate analysis means the analysis of bivariate data. It is one of the simplest forms of statistical analysis, used to find out if there is a relationship between two sets of values. It usually involves the variables X and Y.

The data for a bivariate analysis can be stored in a two-column data table, with one column for each variable. For example, you might want to find out the relationship between caloric intake and weight (there is a fairly strong relationship between the two). Caloric intake would be your independent variable, X, and weight would be your dependent variable, Y.

Bivariate analysis is not the same as two-sample data analysis. With two-sample data analysis (like a two-sample z test in Excel), the X and Y values are not directly related, and you can have a different number of data values in each sample. With bivariate analysis, there is a Y value for each X. Let’s say you had a caloric intake of 3,000 calories per day and a weight of 300 lbs. You would write that with the x-variable followed by the y-variable: (3000, 300).

Two-sample data analysis:
Sample 1: 100, 45, 88, 99
Sample 2: 44, 33, 101

Bivariate analysis:
(X, Y): (3000, 300)
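
The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration: the two samples and the pair (3000, 300) come from the text above, while the other two pairs and the helper name `is_paired` are made up for the example.

```python
# Two-sample data: two separate lists that may differ in length.
sample_1 = [100, 45, 88, 99]
sample_2 = [44, 33, 101]

# Bivariate data: every observation is an (x, y) pair.
# (3000, 300) is the article's example; the other pairs are hypothetical.
bivariate = [(3000, 300), (2500, 220), (2000, 160)]


def is_paired(xs, ys):
    """Return True when the two lists have one y value for each x value."""
    return len(xs) == len(ys)


# Split the pairs back into the two columns of a two-column table.
calories = [x for x, _ in bivariate]
weights = [y for _, y in bivariate]

print(is_paired(sample_1, sample_2))  # False: 4 values vs. 3 values
print(is_paired(calories, weights))   # True: one weight per caloric intake
```

Equal lengths are necessary for bivariate data but not sufficient: the values must also be matched observation by observation, which is why the pairs are stored together as tuples here.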

Types of Bivariate Analysis

Common types of bivariate analysis include:

1. Scatter Plots

These give you a visual idea of the pattern that your variables follow.

A simple scatterplot.

A simple scatterplot.

2. Regression Analysis

Regression analysis is a catch-all term for a wide variety of tools that you can use to determine how your data points might be related. In the scatterplot above, the points look like they could follow an exponential curve (as opposed to a straight line). Regression analysis can give you the equation for that curve or line. It can also give you the correlation coefficient.
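
As a minimal sketch of what "giving you the equation" means, the code below fits the simplest case, a straight line, by ordinary least squares. It assumes a linear model (an exponential curve like the one described above would need a different model), and both the function name `fit_line` and the data are hypothetical, chosen so the points lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

With real data the points will not sit exactly on the line, and the fitted slope and intercept describe the line that minimizes the squared vertical distances to the points.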

3. Correlation Coefficients

Calculating values for correlation coefficients is usually performed on a computer, although the correlation coefficient can also be found by hand. This coefficient tells you whether the variables are linearly related. Basically, a zero means they aren’t correlated (i.e. there is no linear relationship between them), while a 1 (either positive or negative) means that the variables are perfectly correlated (i.e. they are perfectly in sync with each other).
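
The sketch below computes one common correlation coefficient, Pearson's r, using only the Python standard library. The article does not say which coefficient it has in mind, so Pearson's r is an assumption, and the function name `pearson_r` and the small data sets are made up to show the three cases: close to +1, close to -1, and close to 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data sets.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])       # y rises in step with x
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])     # y falls in step with x
r_v = pearson_r([1, 2, 3, 4, 5], [2, 1, 0, 1, 2])  # V-shaped pattern

print(round(r_up, 6), round(r_down, 6), round(r_v, 6))
```

The V-shaped data set is worth a second look: its coefficient is essentially zero even though y clearly depends on x, which is why a scatter plot is a useful companion to the number.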

Bivariate Analysis Definition & Example was last modified: October 15th, 2017 by Andale

11 thoughts on “Bivariate Analysis Definition & Example”

  1. Anorue Ugochukwu Justin

    I need your help please. Let F(x,y) denote the cdf of the bivariate rv (X,Y). Show that for a < b and c < d

    P(a < X ≤ b, c < Y ≤ d) = F(b,d) – F(a,d) – F(b,c) + F(a,c)

  2. adenike jimoh

    Thanks, this piece is quite useful.
    I am trying to relate educational level (no formal education, primary education, tertiary education) and developmental quotient (normal or delay).
    Is this a bivariate analysis?

  3. KS033

    I’m researching the relation between Volatility of stocks and Macro-Economic factors. Should I use Bivariate or Multivariate Analysis?