How to Compute Pearson’s Correlation Coefficients
Correlation coefficients are used in statistics to measure how strong the relationship is between two variables. There are several types of correlation coefficient; Pearson’s correlation (also written Pearson correlation) is the one commonly used in linear regression.
Sample question: compute the value of the correlation coefficient from the following table:
| Subject | Age x | Glucose Level y |
|---|---|---|
| 1 | 43 | 99 |
| 2 | 21 | 65 |
| 3 | 25 | 79 |
| 4 | 42 | 75 |
| 5 | 57 | 87 |
| 6 | 59 | 81 |
Step 1: Make a chart. Use the given data, and add three more columns: xy, x², and y².
| Subject | Age x | Glucose Level y | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 43 | 99 | | | |
| 2 | 21 | 65 | | | |
| 3 | 25 | 79 | | | |
| 4 | 42 | 75 | | | |
| 5 | 57 | 87 | | | |
| 6 | 59 | 81 | | | |
Step 2: Multiply x and y together to fill the xy column. For example, row 1 would be 43 × 99 = 4,257.
| Subject | Age x | Glucose Level y | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 43 | 99 | 4257 | | |
| 2 | 21 | 65 | 1365 | | |
| 3 | 25 | 79 | 1975 | | |
| 4 | 42 | 75 | 3150 | | |
| 5 | 57 | 87 | 4959 | | |
| 6 | 59 | 81 | 4779 | | |
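The xy column can also be filled in programmatically. The snippet below is a minimal Python sketch, not part of the original worked example; the variable names `x`, `y`, and `xy` are illustrative choices.

```python
# Ages (x) and glucose levels (y) for the six subjects in the table.
x = [43, 21, 25, 42, 57, 59]
y = [99, 65, 79, 75, 87, 81]

# xy column: multiply each subject's x by the same subject's y.
xy = [xi * yi for xi, yi in zip(x, y)]

print(xy)  # [4257, 1365, 1975, 3150, 4959, 4779]
```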
Step 3: Take the square of the numbers in the x column, and put the result in the x² column.
| Subject | Age x | Glucose Level y | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 43 | 99 | 4257 | 1849 | |
| 2 | 21 | 65 | 1365 | 441 | |
| 3 | 25 | 79 | 1975 | 625 | |
| 4 | 42 | 75 | 3150 | 1764 | |
| 5 | 57 | 87 | 4959 | 3249 | |
| 6 | 59 | 81 | 4779 | 3481 | |
Step 4: Take the square of the numbers in the y column, and put the result in the y² column.
| Subject | Age x | Glucose Level y | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 43 | 99 | 4257 | 1849 | 9801 |
| 2 | 21 | 65 | 1365 | 441 | 4225 |
| 3 | 25 | 79 | 1975 | 625 | 6241 |
| 4 | 42 | 75 | 3150 | 1764 | 5625 |
| 5 | 57 | 87 | 4959 | 3249 | 7569 |
| 6 | 59 | 81 | 4779 | 3481 | 6561 |
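The two squared columns from Steps 3 and 4 can be reproduced the same way. This is a small Python sketch under the same assumptions as before (the list names are illustrative, not from the original article).

```python
# Ages (x) and glucose levels (y) for the six subjects in the table.
x = [43, 21, 25, 42, 57, 59]
y = [99, 65, 79, 75, 87, 81]

# x² and y² columns: square every entry of each original column.
x_squared = [xi ** 2 for xi in x]
y_squared = [yi ** 2 for yi in y]

print(x_squared)  # [1849, 441, 625, 1764, 3249, 3481]
print(y_squared)  # [9801, 4225, 6241, 5625, 7569, 6561]
```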
Step 5: Add up all of the numbers in each column and put the result at the bottom of that column. The Greek letter sigma (Σ) is a short way of saying “sum of.”
| Subject | Age x | Glucose Level y | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 43 | 99 | 4257 | 1849 | 9801 |
| 2 | 21 | 65 | 1365 | 441 | 4225 |
| 3 | 25 | 79 | 1975 | 625 | 6241 |
| 4 | 42 | 75 | 3150 | 1764 | 5625 |
| 5 | 57 | 87 | 4959 | 3249 | 7569 |
| 6 | 59 | 81 | 4779 | 3481 | 6561 |
| Σ | 247 | 486 | 20485 | 11409 | 40022 |
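The column totals in the Σ row can be verified with a few lines of Python. This is only a sketch for checking the arithmetic; the names `sum_x`, `sum_y`, `sum_xy`, `sum_x2`, and `sum_y2` are illustrative.

```python
# Ages (x) and glucose levels (y) for the six subjects in the table.
x = [43, 21, 25, 42, 57, 59]
y = [99, 65, 79, 75, 87, 81]

# Column totals, matching the Σ row of the table.
sum_x = sum(x)                                  # Σx
sum_y = sum(y)                                  # Σy
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # Σxy
sum_x2 = sum(xi ** 2 for xi in x)               # Σx²
sum_y2 = sum(yi ** 2 for yi in y)               # Σy²

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 247 486 20485 11409 40022
```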
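Step 6: Use the following formula to work out the correlation coefficient.

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$

The answer is: 2868 / 5413.27 = 0.529809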
From our table:
- Σx = 247
- Σy = 486
- Σxy = 20,485
- Σx² = 11,409
- Σy² = 40,022
- n is the sample size; in our case, n = 6
so the correlation coefficient is

$$ r = \frac{6(20{,}485) - (247 \times 486)}{\sqrt{\left[6(11{,}409) - 247^2\right]\left[6(40{,}022) - 486^2\right]}} = \frac{2868}{\sqrt{7445 \times 3936}} \approx 0.5298 $$
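The substitution above can be checked with a short script. The following is a minimal Python sketch of the computational formula from Step 6, not code from the original article; the function name `pearson_r` is an illustrative assumption. Note that the square root is applied to the whole denominator before dividing.

```python
import math


def pearson_r(x, y):
    """Pearson's r from the column sums, using the formula in Step 6."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of values")
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)

    numerator = n * sum_xy - sum_x * sum_y
    # The square root covers the entire product in the denominator.
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


x = [43, 21, 25, 42, 57, 59]   # Age
y = [99, 65, 79, 75, 87, 81]   # Glucose level

r = pearson_r(x, y)
print(round(r, 4))  # 0.5298
```

For this data the numerator is 2868 and the denominator is about 5413.27, which matches the answer given in Step 6.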
The correlation coefficient ranges from -1 to 1. Our result is 0.5298, which means the variables have a moderate positive correlation.
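As a cross-check, the same coefficient can be computed from deviations about the means, which is the definitional form of Pearson's r and is algebraically equivalent to the formula in Step 6. This is a minimal Python sketch with illustrative variable names that do not come from the original article.

```python
import math

x = [43, 21, 25, 42, 57, 59]   # Age
y = [99, 65, 79, 75, 87, 81]   # Glucose level

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sum of cross-products of deviations, and sums of squared deviations.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

r_dev = s_xy / math.sqrt(s_xx * s_yy)

print(round(r_dev, 4))  # 0.5298
```

Both forms give the same value, and the result falls inside the range of -1 to 1.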
Jennifer Thomas said:
Nov 08, 09 at 8:28 pmI’m a little confused. Based on the formula, I thought that instead of squaring 114092 and 40022, you should square x (247)and y (486).
Bill Bryan said:
Dec 01, 09 at 11:22 amI think this is the part of the course that you can feel your brain growing larger. The Correlation Coefficient equation is a long process, if only there was a way to shorten the problem.
Donna Allen said:
Dec 02, 09 at 7:42 pmI too wish there was a shorter way to do this problem. I’m just thankful that I actually understand how to work the problem. Your explanation was helpful and easy to follow. Thank you!
Vanessa said:
Dec 03, 09 at 9:57 pmThis example was really helpful and I understand how to calculate the problem and how to do all of the steps but the only problem I am having is how did you get the final answer which in your example it says 1.44281 … in mathzone I did the whole problem like it said and I even saw the example and it was right but the final answer I dont know how they got to. i got 2.14866937 E -4 , but the answer was 0.947. please help me .. im I missing something?
Vanessa said:
Dec 04, 09 at 12:54 amI understand, I just figured out the right answer now. I didnt know I had to square root the bottom part, and even though this helped me alot, i used google and they helped me figure out the last part by explaining everything step by step and unfortunately thats what i need.
Alison Bryant said:
Apr 26, 10 at 12:19 pmI have found that it is easiest, and you get the same answer by going through the Linreg function on the calculator, it gives you the correlation coefficent as well as the correlation of determination.
Tony said:
Apr 07, 11 at 4:10 amExcellent example. A couple of mistakes though! 6×11409 = 68454
Also you must take the square-root of the denominator. I make the answer 0.5298
Ronak said:
Aug 02, 11 at 11:22 amOn this page you showed that r’s denominator is a square root
http://www.statisticshowto.com/articles/how-to-compute-pearsons-correlation-coefficients/
but on this page, you didn’t do it.
http://www.statisticshowto.com/help-with-statistics-equations/
Also, step 5 is wrong.
6 * 11409 = 68,454 not 66,294
In step 7, you used 68,454 which is correct but when you subtracted 61009 from it you got the incorrect value of 5,285.
68,454 – 61,009 = 7,445
In step 11, it becomes 7,445 * 3936 = 29303520.
The final answer should be 2896 / 29303520 = 9.78722e-05 = 0.000097822
Regards,
nick bhullar said:
Oct 09, 11 at 6:14 pmyou are absolutely correct
ARCHANA said:
Oct 15, 11 at 4:20 amITS GOOD,EASY TO UNDERSTAND
Stephanie said:
Oct 27, 11 at 9:09 pmThanks for spotting the error in the formula! An update is on the way for the long step from the book. In the meantime, this page has been updated with the correct answer (thanks, Tony!).
Seema Dessai said:
Dec 22, 11 at 9:59 pmThis explanation i personally found to be the best after going through many explanation based on the same formula.Thank you very much for such a simple and understanding method of explanation of Pearson’s Correlation Coefficients.
Molly said:
Apr 19, 12 at 11:45 amwow, thank you so much, the steps are wonderfully helpful and adaptable
paresh said:
May 27, 12 at 9:10 pmA B C D E F G H
10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
AVg
10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
How to calculate corelation coefficient of above .
Habiba Abdi said:
Jun 12, 12 at 4:28 amkindly assist on how to calculate the correlation coefficient step by step
jonathan D Hantapat said:
Jul 22, 12 at 7:20 amThank you so much for the easy self explanatory examples given. I really appreciate it. I was given an assignment for analytical chemistry on statistics and so thankful that this website help solve my 75% of my assignment.
Andale said:
Jul 22, 12 at 7:47 amGreat! That’s what the site is here for. It’s always nice to hear it helped out :)
Stephanie
shawty said:
Aug 13, 12 at 4:17 amperfect example.has great step by step guidance which makes it very easy to understand
Genardo_27 said:
Aug 28, 12 at 10:00 pmwill some one help us how to solve genetic correlation problems? it is our report yet were not ready>> we are totally dead to our very owned teacher… S O S..
pls. add link to my f.b page. genard_perias27@yahoo.com
thank you!
Erika P said:
Sep 12, 12 at 10:18 pmSuper easy to follow, loved the whole “step-by-step” thing! I’m seriously mathematically challenged, so I was super happy to have found such a helpful website!
Isaac Zaji said:
Sep 16, 12 at 10:17 amYour steps of calculating correlation coefficient is wonderful
Andale said:
Sep 17, 12 at 5:04 amThanks, Isaac!
Pixie said:
Sep 21, 12 at 11:50 amMust bookmark this site! Extremely helpful in taking online classes and trying to teach myself statistics. I kept getting numbers like 38 for r until I read this article v.v
Bharat prajapati said:
Sep 30, 12 at 1:38 amIt is very helpful website.We can easy to understand our question from this website.
Abel said:
Oct 23, 12 at 4:25 amthanks your steps makes it easy to understand!
kobby maloiso said:
Oct 30, 12 at 2:34 amhelp me to answer this please its urgent:(a researcher correlated the MTAI scores of a group of 100 experienced secondary school teachers with the number of students each teacher failed in a year.He obtained an r of -0.39.He concluded that teachers tend to fail students because they do not have “accepting”attitudes towards students. Comment on the researcher’s methods and conclusions.
Andale said:
Oct 30, 12 at 11:03 amHello, Kobby,
Please ask your question on the forum and one of our mods will get back to you:
http://www.statisticshowto.com/forums/
Thanks!
Stephanie
CW said:
Nov 01, 12 at 10:15 am Remember to get the square root of the denominator before dividing!
The formula as shown should read:
6(20,485) – (247 × 486) /
√[ [6(11,409) - 247²] × [6(40,022) - 486²] ] <– square root!
=0.5298
ZogaraUmmey Hassan said:
Nov 08, 12 at 3:04 amExcellent. Help me a lot to find out correlation coefficients.
MARIA said:
Nov 12, 12 at 10:03 amX Y
7 9
8 11
12 12
4 13
16 15
18 17
10 18
FIND COEFFICIENT OF CORRELATION BY PEARSON METHOD PLEASE?
Andale said:
Nov 12, 12 at 1:29 pmHi, Maria,
Would you mind posting this in our forum? One of our mods would be happy to help :)
http://www.statisticshowto.com/forums/
Thanks,
Stephanie
MARIA said:
Nov 13, 12 at 10:06 pmthanx Stephanie. To share something more ,can u mail at
qaseem65@gmail.com
MARIA said:
Nov 13, 12 at 10:09 pmHi kobby
To me,teachers r right
MARIA said:
Nov 13, 12 at 10:15 pmCan someone help me tell about level of significance at 0.05/0.01 from Chai square table,with reference to hopothesis H0/HA ?? thx
Ria said:
Nov 15, 12 at 11:32 amHi everyone, help please.
I am very new to stats but need to grasp it quickly to analyze data in my thesis.
I want to find out the nature of relationship between pop love songs themes and imagined interactions (Imagined Interaction Theory. The instrument to measure imagined interactions is a 7point interval scale ranging from strongly disagree to strongly agree.
Can I use Pearson’s R to test the coefficient between my variables: love songs themes and imagined interactions?
Thanks in advance for help
Justin said:
Nov 15, 12 at 1:12 pmcan anyone help me do this stat homework?
1) Watson & Watson Repair Inc. provides maintenance service for a large apartment complex in downtown
Saint Petersburg, Florida. W & W managers are evaluating the possibility of hiring another maintenance
person because it seems maintenance calls are increasing. Rafael Roddick and Andy Nadal are currently
responsible for maintenance tasks. To investigate “what” drives Repair Time, the managers hire you as
statistician to conduct a regression analysis. The table below provides data from a random selected sample of
10 maintenance calls.
a. (1pt)How would you include “responsible for maintenance” in your regression? (How would you define
it?)
STEP 1: use the dummy variable REPAIRPERSON = 1 IF responsible = RAFAEL
REPAIRPERSON = 0 IF responsible = ANDY
A regression model is set up using ONLY repairperson as variable to explain REPAIRTIME
a. (1pt) Comment on the “correlation” between Repairperson and Repairtime.
b. (1pt) Comment on goodness of fit of the model
| Maintenance Call | Repair Time (hours) | Months Since Last Service | Responsible for maintenance |
|---|---|---|---|
| 1 | 2.9 | 3 | Rafael Roddick |
| 2 | 3 | 3.9 | Rafael Roddick |
| 3 | 4.8 | 8.2 | Andy Nadal |
| 4 | 1.8 | 3 | Rafael Roddick |
| 5 | 2.9 | 2 | Rafael Roddick |
| 6 | 4.9 | 7 | Andy Nadal |
| 7 | 4.4 | 9 | Andy Nadal |
| 8 | 4.5 | 8.5 | Andy Nadal |
| 9 | 4.4 | 4 | Andy Nadal |
| 10 | 4.5 | 6 | Rafael Roddick |
Correlations
Repairtime Repair
Pearson
Correlation
Repairtime 1.000 -.783
Repairperson -.783 1.000
Model Summary (b)
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .783 (a) | .614 | .565 | .70071 |
a. Predictors: (Constant), Repairperson
b. Dependent Variable: Repairtime2
c. (2pt) Report the statistical significance of the coefficients.
STEP 2: Use MONTHS SINCE LAST SERVICE AND REPAIRPERSON in a regression to explain REPAIRTIME
a. (1pt) Comment on the scatter diagram for Months-since-last-service and Repairtime.
b. (2pt) Comment on goodness of fit of the model. Do you find any difference with respect to the goodness of
fit of the model in STEP 1?
c. (1pt) Comment on the normality assumptions of the model.
Coefficients (a)
| Model | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| 1 (Constant) | 4.600 | .313 | | 14.679 | .000 |
| Repairperson | -1.580 | .443 | -.783 | -3.565 | .007 |
a. Dependent Variable: Repairtime
Model Summary (b)
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .839 (a) | .705 | .620 | .65498 |
a. Predictors: (Constant), monthslastservice, Repairperson
b. Dependent Variable: Repairtime3
d. (3pt) Report the statistical significance of the coefficients.
e. (1pt) Why do you think the statistical significance of the coefficient for repairperson has changed from step 1
to step 2?
In Step 1, repairperson was the only variable explaining repair time. It seems that the
combining this variable with months since last service the, repairperson loses explanatory
power, which is reflected in the SS of the coefficient.
f. (1pt) Write down the estimated regression equation.
g. (2pt) Interpret the intercept for this model
h. (2pt) Provide an interpretation for the slope coefficients of the model.
Coefficients (a)
| | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 3.195 | 1.001 | | 3.192 | .015 |
| Repairperson | -.860 | .642 | -.426 | -1.340 | .222 |
| monthslastservice | .191 | .130 | .467 | 1.468 | .18 |
STEP 3: Use MONTHS SINCE LAST SERVICE to capture the curvature explaining REPAIRTIME
1) (2pt) From all models bellow, which you think is best?
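Model Summary and Parameter Estimates
Dependent Variable: Repairtime
| Equation | R Square | F | df1 | df2 | Sig. | Constant | b1 | b2 | b3 |
|---|---|---|---|---|---|---|---|---|---|
| Linear | .629 | 13.558 | 1 | 8 | .006 | 2.036 | .325 | | |
| Quadratic | .709 | 8.531 | 2 | 7 | .013 | .213 | 1.130 | -.072 | |
| Cubic | .765 | 6.515 | 3 | 6 | .026 | 3.639 | -1.227 | .405 | -.029 |
R Square through Sig. are the model summary; Constant through b3 are the parameter estimates.
The independent variable is monthslastservice.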
The cubic model has a good fit as 76.5% so it represents a
better fit for the model
2) (10pt) Given the following estimated regression equation and SPSS output from regression, fill in the
missing values. Show your calculations.
ANOVA
Model Sum of Squares df Mean Square F
1 Regression
Residual
Total 25.5 7
Coefficients
Model
Unstandardized Coefficients
B Std. Error t
1 (Constant) 83.23 1.574 52.882
X1 0.304
X2 1.301 0.321 4.057
Gert said:
Nov 19, 12 at 2:25 amHi All,
How does this calculation work when one of the datasets are percetages?
AKHTAR RASOOL said:
Nov 19, 12 at 4:49 amHow to calculate pearson coefficienr for a line in the graph.
Andale said:
Nov 22, 12 at 5:09 amHi, Justin,
Please post your question on the forums. One of our mods will be able to help you (but please post one question at a time :) ).
Regards,
Stephanie
anum said:
Nov 23, 12 at 2:49 ami am unable to find the correct coefficient of correlation when it gives the negative value in the square root.
Andale said:
Nov 29, 12 at 1:47 pmAnum,
Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!
http://www.statisticshowto.com/forums/
Stephanie
Andale said:
Nov 29, 12 at 1:48 pmAhhtar,
Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!
http://www.statisticshowto.com/forums/
Stephanie
Andale said:
Nov 29, 12 at 1:48 pmGert,
Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!
http://www.statisticshowto.com/forums/
Stephanie
Andale said:
Nov 29, 12 at 1:49 pmHi, Ria,
Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!
http://www.statisticshowto.com/forums/
Stephanie
Rifat sheikh said:
Nov 30, 12 at 6:35 amThanks to help
roy omondi said:
Jan 07, 13 at 5:34 amhelpfull an easy to understand
lilian richard said:
Jan 21, 13 at 3:53 pmcan you assist me in choosing the test statistic tools in analyzing my hypotheses such as follows,
1.there is a relationship between m-pesa and the economic and social outcomes in the society.
2.there is a relationship between strategies and approaches used by m-pesa and customer satisfaction.
3.there is relationship between transaction cost and the extent of use of m-pesa
Andale said:
Jan 22, 13 at 6:28 amLilian,
Time constraints prevent me from answering stats questions in the comments…but post on our forums and our mod will be happy to help :)
Stephanie
Derek said:
Feb 14, 13 at 8:29 amThank you so much for the step by step approach. Now if only I get get my college professors to explain things this way!
Sherry said:
Mar 08, 13 at 1:01 pmWhen constructing the data table, do you use the percent or decimal? For example, x = the number of jobs in a particular state and y = the percent of poverty in that state. Would y = 15.2% or would y = 0.152 for the calculation? Thanks
Andale said:
Mar 11, 13 at 5:44 amSherry,
Use decimal. That makes multiplication possible. For example, if you were to multiply 10% by 10%, you would first have to convert them to decimals anyway (.1 * .1).
Regards,
Stephanie
deno said:
Apr 15, 13 at 12:10 pmFolks,
We are in a grp project for our research class in medical informatics. We are to present the coorelation data in class , any suggestions on how to present this data graphically ? Any software any thing ? Can excel do it ?
Thanks in advance for your help
regards
DenO
gaurang said:
Apr 17, 13 at 5:38 amCAN I GET THE LIST OF ALL THE FORMULAS FOR CORRELATION & COEFFICIENTS
grace said:
Apr 20, 13 at 3:37 amThank you so much for the step by step approach.The steps are wonderful adaptable.
Andale said:
Apr 23, 13 at 12:51 pmHi, Joshua,
Thank you for your question. Unfortunately, time constraints prevent me from answering math questions in the comments. Could you post your question on our forums? One of our mods would be glad to help.
Stephanie
Andale said:
Apr 23, 13 at 12:51 pmHi, Guarang,
Thank you for your question. Unfortunately, time constraints prevent me from answering math questions in the comments. Could you post your question on our forums? One of our mods would be glad to help.
Stephanie