This article discusses the Pearson correlation coefficient. If you want to know how to use it, see this article on How to Use the Pearson Correlation Coefficient.
What is the Pearson Correlation Coefficient?
Correlation between variables is a measure of how well the variables are related. The most common measure of correlation in statistics is the Pearson Correlation (technically called the Pearson Product Moment Correlation or PPMC), which shows the linear relationship between two variables. Two letters are used to represent the Pearson correlation: Greek letter rho (ρ) for a population and the letter “r” for a sample.
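As a rough illustration of what the sample coefficient r measures, the sketch below computes it from scratch: the sum of the products of each variable's deviations from its mean, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the hours/scores data are made up for this example and are not taken from any study.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation r for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator pieces: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
print(round(r, 3))
```

Because the points lie close to a rising straight line, r comes out close to 1 for this data.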
What are the Possible Values for the Pearson Correlation?
Results are between -1 and 1. A result of -1 means that there is a perfect negative correlation between the two variables, while a result of 1 means that there is a perfect positive correlation between the two variables. A result of 0 means that there is no linear relationship between the two variables. You will very rarely get a correlation of exactly 0, -1 or 1; you’ll usually get somewhere in between. The closer the value of r gets to zero, the more widely the data points are scattered around the line of best fit.
High correlation: .5 to 1.0 or -0.5 to -1.0
Medium correlation: .3 to .5 or -0.3 to -0.5
Low correlation: .1 to .3 or -0.1 to -0.3
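The bands above can be turned into a small lookup, sketched here. Two choices are assumptions of this example rather than part of the guideline: a value that falls exactly on a boundary (such as .5) is assigned to the stronger band, and values closer to zero than .1, which the list does not name, are labeled "negligible". The function name correlation_strength is hypothetical.

```python
def correlation_strength(r):
    """Label a Pearson r using the high / medium / low bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a Pearson correlation must lie between -1 and 1")
    size = abs(r)  # the bands are symmetric, so only the magnitude matters
    if size >= 0.5:
        return "high"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "low"
    # Assumption: the list above does not name this range.
    return "negligible"

for value in (0.8, -0.4, 0.2, 0.05):
    print(value, correlation_strength(value))
```

The label describes strength only; the sign of r still tells you whether the relationship is positive or negative.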
The PPMC does not differentiate between dependent and independent variables. For example, if you are investigating the correlation between a high caloric diet and diabetes, you might find a high correlation of .8. You would get the same .8 if you ran the PPMC with the variables switched around, but reading that result as “diabetes causes a high caloric diet” would make no sense. Therefore, as a researcher you have to be mindful of the variables you are plugging in and of how you interpret the result. In addition, the PPMC will not give you any information about how steep the line is: its sign tells you the direction of the relationship, and its size only tells you how strong the correlation is.
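Both points can be checked numerically. The sketch below uses made-up data and hypothetical helper names (deviations_summary, pearson_r, slope); it suggests that swapping the two variables leaves r unchanged, and that making the line ten times steeper changes the least-squares slope but not r.

```python
import math

def deviations_summary(xs, ys):
    """Return (sxy, sxx, syy): the deviation sums used by both r and the slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

def pearson_r(xs, ys):
    sxy, sxx, syy = deviations_summary(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def slope(xs, ys):
    """Least-squares slope of the line predicting ys from xs."""
    sxy, sxx, _ = deviations_summary(xs, ys)
    return sxy / sxx

# Made-up illustration data, roughly linear.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_steep = [10 * value for value in y]  # same pattern, ten times steeper

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)            # variables switched around
r_steep = pearson_r(x, y_steep)   # steeper line, same scatter pattern

print(math.isclose(r_xy, r_yx))      # swapping variables does not change r
print(math.isclose(r_xy, r_steep))   # rescaling y does not change r
print(slope(x, y), slope(x, y_steep))
```

The two slopes differ by a factor of ten while r stays the same, which is why r on its own cannot be used to read off how much one variable changes per unit of the other.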
Real Life Example
Pearson correlation is used in thousands of real life situations. For example, scientists in China wanted to know if there was a correlation between spatial distribution and genetic differentiation in weedy rice populations in a study to determine the evolutionary potential of weedy rice. The graph below shows the observed heterozygosity of weedy rice plotted against the multilocus outcrossing rate. Pearson’s correlation between the two groups was analyzed, showing a significant positive correlation of between 0.783 and 0.895 for weedy rice populations.
If you’re interested in seeing more examples of PPMC, you can find several studies on the National Institutes of Health’s Open-i website, which shows results from studies on topics ranging from breast cyst imaging to the role that carbohydrates play in weight loss.