Statistics How To

What is Correlation in Statistics? Correlation Analysis Explained


What is Correlation?

Correlation is used to test relationships between quantitative variables or categorical variables. In other words, it’s a measure of how things are related. The study of how variables are correlated is called correlation analysis.

Some examples of data that have a high correlation:

  • Your caloric intake and your weight.
  • Your eye color and your relatives’ eye colors.
  • The amount of time you study and your GPA.

Some examples of data that have a low correlation (or none at all):

  • Your sexual preference and the type of cereal you eat.
  • A dog’s name and the type of dog biscuit they prefer.
  • The cost of a car wash and how long it takes to buy a soda inside the station.

Correlations are useful because once you know how variables are related, you can make predictions about future behavior. Being able to anticipate outcomes is very important in fields like government and healthcare. Businesses also use these statistics for budgets and business plans.

What is Correlation: The Correlation Coefficient.

A correlation coefficient is a way to put a number on the relationship. Correlation coefficients such as Pearson's take a value between -1 and 1. A “0” means there is no linear relationship between the variables, while -1 or 1 means that there is a perfect negative or positive correlation (negative or positive here refers to the direction of the line the relationship produces on a graph: sloping downward or sloping upward).


Graphs showing a correlation of -1, 0 and +1
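
To make the three cases concrete, here is a minimal Python sketch that computes the Pearson correlation coefficient from scratch. The function name `pearson_r` and the small data sets are made up for illustration; they are not from the original article. The third data set follows a curve (y = (x - 3)^2), so its coefficient comes out as 0 even though y clearly depends on x, which is a reminder that this coefficient only picks up linear relationships.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of products of deviations (numerator) and sums of squared
    # deviations (used in the denominator).
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return sxy / math.sqrt(sxx * syy)


# Hypothetical example data.
x = [1, 2, 3, 4, 5]
perfect_positive = [2, 4, 6, 8, 10]   # y rises in lockstep with x
perfect_negative = [10, 8, 6, 4, 2]   # y falls in lockstep with x
curved = [4, 1, 0, 1, 4]              # y = (x - 3)^2, no linear trend

print(pearson_r(x, perfect_positive))  # 1.0
print(pearson_r(x, perfect_negative))  # -1.0
print(pearson_r(x, curved))            # 0.0
```

Real data almost never lands exactly on -1, 0 or 1; values in between indicate weaker or stronger linear relationships.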

What is Correlation: Types of correlation coefficients.

The most common correlation coefficient is the Pearson Correlation Coefficient. It’s used to test for linear relationships between data. In AP stats or elementary stats, the Pearson is likely the only one you’ll be working with. However, you may come across others, depending upon the type of data you are working with. For example, the Goodman and Kruskal lambda coefficient is fairly common for categorical (nominal) data. It can be symmetric, where you do not have to specify which variable is dependent, or asymmetric, where the dependent variable is specified.

Goodman and Kruskal lambda coefficient: λ = (ε1 - ε2) / ε1

ε1 is the overall non-modal frequency and ε2 is the sum of the non-modal frequencies for each value of the independent variable.
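
As a rough illustration of these two quantities, the Python sketch below computes the asymmetric lambda from a table of counts, assuming the standard form λ = (ε1 - ε2) / ε1. The function name `goodman_kruskal_lambda`, the table layout (rows are values of the independent variable, columns are categories of the dependent variable) and the counts are all assumptions made for this example.

```python
def goodman_kruskal_lambda(table):
    """Asymmetric Goodman and Kruskal lambda.

    table[i][j] is the count of observations with independent-variable
    value i and dependent-variable category j (an assumed layout).
    """
    # Totals for each category of the dependent variable.
    column_totals = [sum(col) for col in zip(*table)]
    n = sum(column_totals)

    # e1: overall non-modal frequency, i.e. everything outside the single
    # most common category of the dependent variable.
    e1 = n - max(column_totals)

    # e2: non-modal frequencies summed over each value of the
    # independent variable (each row).
    e2 = sum(sum(row) - max(row) for row in table)

    if e1 == 0:
        raise ValueError("lambda is undefined when all observations share one category")

    return (e1 - e2) / e1


# Hypothetical counts: 2 values of the independent variable (rows),
# 2 categories of the dependent variable (columns).
table = [
    [30, 10],
    [10, 50],
]

print(goodman_kruskal_lambda(table))  # 0.5
```

In this made-up table, guessing the overall most common category gets 40 of 100 cases wrong (ε1 = 40), while guessing the most common category within each row gets 20 wrong (ε2 = 20), so knowing the independent variable cuts prediction errors in half.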


What is Correlation in Statistics? Correlation Analysis Explained was last modified: November 13th, 2017 by Stephanie