How to Compute Pearson’s Correlation Coefficients

Correlation coefficients are used in statistics to measure how strong the relationship between two variables is. There are several types of correlation coefficient; Pearson’s correlation (also called Pearson’s r) measures the strength of a linear relationship and is commonly used alongside linear regression.

Sample question: compute the value of the correlation coefficient from the following table:

Subject | Age (x) | Glucose level (y)
1       | 43      | 99
2       | 21      | 65
3       | 25      | 79
4       | 42      | 75
5       | 57      | 87
6       | 59      | 81

Step 1: Make a chart. Use the given data, and add three more columns: xy, x², and y².

Subject | Age (x) | Glucose level (y) | xy | x² | y²
1       | 43      | 99                |    |    |
2       | 21      | 65                |    |    |
3       | 25      | 79                |    |    |
4       | 42      | 75                |    |    |
5       | 57      | 87                |    |    |
6       | 59      | 81                |    |    |

Step 2: Multiply x and y together to fill the xy column. For example, row 1 would be 43 × 99 = 4,257.

Subject | Age (x) | Glucose level (y) | xy   | x² | y²
1       | 43      | 99                | 4257 |    |
2       | 21      | 65                | 1365 |    |
3       | 25      | 79                | 1975 |    |
4       | 42      | 75                | 3150 |    |
5       | 57      | 87                | 4959 |    |
6       | 59      | 81                | 4779 |    |

Step 3: Take the square of the numbers in the x column, and put the result in the x² column.

Subject | Age (x) | Glucose level (y) | xy   | x²   | y²
1       | 43      | 99                | 4257 | 1849 |
2       | 21      | 65                | 1365 | 441  |
3       | 25      | 79                | 1975 | 625  |
4       | 42      | 75                | 3150 | 1764 |
5       | 57      | 87                | 4959 | 3249 |
6       | 59      | 81                | 4779 | 3481 |

Step 4: Take the square of the numbers in the y column, and put the result in the y² column.

Subject | Age (x) | Glucose level (y) | xy   | x²   | y²
1       | 43      | 99                | 4257 | 1849 | 9801
2       | 21      | 65                | 1365 | 441  | 4225
3       | 25      | 79                | 1975 | 625  | 6241
4       | 42      | 75                | 3150 | 1764 | 5625
5       | 57      | 87                | 4959 | 3249 | 7569
6       | 59      | 81                | 4779 | 3481 | 6561

Step 5: Add up all of the numbers in each column and put the result at the bottom of that column. The Greek letter sigma (Σ) is a short way of saying “sum of.”

Subject | Age (x) | Glucose level (y) | xy    | x²    | y²
1       | 43      | 99                | 4257  | 1849  | 9801
2       | 21      | 65                | 1365  | 441   | 4225
3       | 25      | 79                | 1975  | 625   | 6241
4       | 42      | 75                | 3150  | 1764  | 5625
5       | 57      | 87                | 4959  | 3249  | 7569
6       | 59      | 81                | 4779  | 3481  | 6561
Σ       | 247     | 486               | 20485 | 11409 | 40022
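
Steps 2 through 5 can also be reproduced in a few lines of Python. This is only a sketch of the table-building process: the variable names (`ages`, `glucose`, `sums`) are made up for illustration and are not part of the original worked example.

```python
# Data from the sample question: age (x) and glucose level (y) for six subjects.
ages = [43, 21, 25, 42, 57, 59]
glucose = [99, 65, 79, 75, 87, 81]

# Steps 2-4: build the xy, x^2 and y^2 columns row by row.
xy = [x * y for x, y in zip(ages, glucose)]
x_sq = [x * x for x in ages]
y_sq = [y * y for y in glucose]

# Step 5: total every column (the sigma row of the table).
sums = {
    "x": sum(ages),
    "y": sum(glucose),
    "xy": sum(xy),
    "x2": sum(x_sq),
    "y2": sum(y_sq),
}
print(sums)
```

The printed totals should match the Σ row of the table, which is a quick way to catch arithmetic slips before moving on to the formula.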

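Step 6: Use the following formula to work out the correlation coefficient, r:

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$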

The answer is: 2868 / 5413.27 = 0.529809


From our table:

  • Σx = 247
  • Σy = 486
  • Σxy = 20,485
  • Σx² = 11,409
  • Σy² = 40,022
  • n is the sample size, in our case = 6

so the correlation coefficient =

  • [6(20,485) - (247 × 486)] / √([6(11,409) - 247²] × [6(40,022) - 486²])
  • = 2,868 / √(7,445 × 3,936)
  • = 2,868 / 5,413.27
  • = 0.5298
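
The substitution above can be checked with a short Python sketch. The helper name `pearson_from_sums` and its parameter names are hypothetical, chosen only for this illustration; the function takes the square root of the whole denominator, which is the step several readers in the comments missed.

```python
import math


def pearson_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson's r computed from the column totals of the worked table."""
    numerator = n * sum_xy - sum_x * sum_y
    # The square root covers the product of BOTH bracketed terms.
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_from_sums(6, 247, 486, 20485, 11409, 40022)
print(round(r, 4))  # 0.5298
```

Note that the denominator is zero when every x value (or every y value) is identical, so a more defensive version would need to handle that case.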

The correlation coefficient ranges from -1 to 1. Our result, 0.5298, is positive and roughly halfway between 0 and 1, which means the variables have a moderate positive correlation.
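
As a sanity check, the same coefficient can be computed from the mean-centered definition of Pearson's r (the sum of products of deviations divided by the square root of the product of the sums of squared deviations). The sketch below assumes the same six data points; the variable names are illustrative only.

```python
import math

ages = [43, 21, 25, 42, 57, 59]
glucose = [99, 65, 79, 75, 87, 81]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(glucose) / n

# Sums of products and squares of the deviations from each mean.
dev_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, glucose))
dev_xx = sum((x - mean_x) ** 2 for x in ages)
dev_yy = sum((y - mean_y) ** 2 for y in glucose)

r_centered = dev_xy / math.sqrt(dev_xx * dev_yy)
print(round(r_centered, 4))  # 0.5298
```

Both forms are algebraically equivalent, so any difference beyond floating-point rounding would point to an error in the table.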


64 Responses to “How to Compute Pearson’s Correlation Coefficients”

  1. How to Do Everything Statistics » How to Test for Correlation Coefficient said:

    Nov 05, 09 at 10:45 am

    [...] coefficient. Sample question: test the significance of the correlation coefficient r=0.565 (How to calculate a correlation coefficient) using the critical values for PPMC table. Test at α=0.01 for a sample size of 9. Step 1: [...]

  2. How to Do Everything Statistics » How to Find the Coefficient of Determination said:

    Nov 05, 09 at 12:00 pm

    [...] values. Finding the coefficient of determination takes only three steps! Step 1: Find the correlation coefficient, r (it may be given to you in the question). Example, [...]

  3. How to Do Everything Statistics » How to Find a Linear Regression Equation said:

    Nov 05, 09 at 12:38 pm

    [...] When a correlation coefficient shows that data is likely to be able to predict future outcomes, statisticians use linear regression to find a predictive function. If you recall from elementary algebra, the equation for a line is y=mx+b. This article shows you how to take data, calculate linear regression, and find the equation y’=a+bx. Step 1: Make a chart of your data, filling in the columns in the same way as you would fill in the chart if you were finding the Pearson’s Correlation Coefficient. [...]

  4. Jennifer Thomas said:

    Nov 08, 09 at 8:28 pm

    I’m a little confused. Based on the formula, I thought that instead of squaring 114092 and 40022, you should square x (247)and y (486).

  5. Bill Bryan said:

    Dec 01, 09 at 11:22 am

    I think this is the part of the course that you can feel your brain growing larger. The Correlation Coefficient equation is a long process, if only there was a way to shorten the problem.

  6. Donna Allen said:

    Dec 02, 09 at 7:42 pm

    I too wish there was a shorter way to do this problem. I’m just thankful that I actually understand how to work the problem. Your explanation was helpful and easy to follow. Thank you!

  7. Vanessa said:

    Dec 03, 09 at 9:57 pm

    This example was really helpful and I understand how to calculate the problem and how to do all of the steps but the only problem I am having is how did you get the final answer which in your example it says 1.44281 … in mathzone I did the whole problem like it said and I even saw the example and it was right but the final answer I dont know how they got to. i got 2.14866937 E -4 , but the answer was 0.947. please help me .. im I missing something?

  8. Vanessa said:

    Dec 04, 09 at 12:54 am

    I understand, I just figured out the right answer now. I didnt know I had to square root the bottom part, and even though this helped me alot, i used google and they helped me figure out the last part by explaining everything step by step and unfortunately thats what i need.

  9. Alison Bryant said:

    Apr 26, 10 at 12:19 pm

    I have found that it is easiest, and you get the same answer by going through the Linreg function on the calculator, it gives you the correlation coefficent as well as the correlation of determination.

  10. Tony said:

    Apr 07, 11 at 4:10 am

    Excellent example. A couple of mistakes though! 6×11409 = 68454
    Also you must take the square-root of the denominator. I make the answer 0.5298

  11. Ronak said:

    Aug 02, 11 at 11:22 am

    On this page you showed that r’s denominator is a square root

    http://www.statisticshowto.com/articles/how-to-compute-pearsons-correlation-coefficients/

    but on this page, you didn’t do it.

    http://www.statisticshowto.com/help-with-statistics-equations/

    Also, step 5 is wrong.
    6 * 11409 = 68,454 not 66,294

    In step 7, you used 68,454 which is correct but when you subtracted 61009 from it you got the incorrect value of 5,285.

    68,454 – 61,009 = 7,445

    In step 11, it becomes 7,445 * 3936 = 29303520.

    The final answer should be 2896 / 29303520 = 9.78722e-05 = 0.000097822

    Regards,

  12. nick bhullar said:

    Oct 09, 11 at 6:14 pm

    you are absolutely correct

  13. ARCHANA said:

    Oct 15, 11 at 4:20 am

    ITS GOOD,EASY TO UNDERSTAND

  14. Stephanie said:

    Oct 27, 11 at 9:09 pm

    Thanks for spotting the error in the formula! An update is on the way for the long step from the book. In the meantime, this page has been updated with the correct answer (thanks, Tony!).

  15. Seema Dessai said:

    Dec 22, 11 at 9:59 pm

    This explanation i personally found to be the best after going through many explanation based on the same formula.Thank you very much for such a simple and understanding method of explanation of Pearson’s Correlation Coefficients.

  16. Statistics How To» Blog Archive » How to Find a Linear Regression Slope said:

    Jan 03, 12 at 11:15 am

    [...] If you don’t remember how to get those variables from data, see this article on how to find a Pearson’s correlation coefficient. Follow the steps there to create a table and find Σx, Σy, Σxy, Σx2, and [...]

  17. Molly said:

    Apr 19, 12 at 11:45 am

    wow, thank you so much, the steps are wonderfully helpful and adaptable

  18. paresh said:

    May 27, 12 at 9:10 pm

    A B C D E F G H
    10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
    10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
    10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
    AVg
    10.8 49.9 99.65 200.8 1.004 5.005 10.004 50.01
    How to calculate corelation coefficient of above .

  19. Habiba Abdi said:

    Jun 12, 12 at 4:28 am

    kindly assist on how to calculate the correlation coefficient step by step

  20. jonathan D Hantapat said:

    Jul 22, 12 at 7:20 am

    Thank you so much for the easy self explanatory examples given. I really appreciate it. I was given an assignment for analytical chemistry on statistics and so thankful that this website help solve my 75% of my assignment.

  21. Andale said:

    Jul 22, 12 at 7:47 am

    Great! That’s what the site is here for. It’s always nice to hear it helped out :)

    Stephanie

  22. shawty said:

    Aug 13, 12 at 4:17 am

    perfect example.has great step by step guidance which makes it very easy to understand

  23. Genardo_27 said:

    Aug 28, 12 at 10:00 pm

    will some one help us how to solve genetic correlation problems? it is our report yet were not ready>> we are totally dead to our very owned teacher… S O S..

    pls. add link to my f.b page. genard_perias27@yahoo.com

    thank you!

  24. Erika P said:

    Sep 12, 12 at 10:18 pm

    Super easy to follow, loved the whole “step-by-step” thing! I’m seriously mathematically challenged, so I was super happy to have found such a helpful website!

  25. Isaac Zaji said:

    Sep 16, 12 at 10:17 am

    Your steps of calculating correlation coefficient is wonderful

  26. Andale said:

    Sep 17, 12 at 5:04 am

    Thanks, Isaac!

  27. Pixie said:

    Sep 21, 12 at 11:50 am

    Must bookmark this site! Extremely helpful in taking online classes and trying to teach myself statistics. I kept getting numbers like 38 for r until I read this article v.v

  28. Bharat prajapati said:

    Sep 30, 12 at 1:38 am

    It is very helpful website.We can easy to understand our question from this website.

  29. Abel said:

    Oct 23, 12 at 4:25 am

    thanks your steps makes it easy to understand!

  30. kobby maloiso said:

    Oct 30, 12 at 2:34 am

    help me to answer this please its urgent:(a researcher correlated the MTAI scores of a group of 100 experienced secondary school teachers with the number of students each teacher failed in a year.He obtained an r of -0.39.He concluded that teachers tend to fail students because they do not have “accepting”attitudes towards students. Comment on the researcher’s methods and conclusions.

  31. Andale said:

    Oct 30, 12 at 11:03 am

    Hello, Kobby,

    Please ask your question on the forum and one of our mods will get back to you:
    http://www.statisticshowto.com/forums/

    Thanks!
    Stephanie

  32. CW said:

    Nov 01, 12 at 10:15 am

    Remember to get the square root of the denominator before dividing!

    The formala as shown should show:

    6(20,485) – (247 × 486) /

    [ [6(11,409) - 247²] × [6(40,022) - 486²]] <– square root!

    =0.5298

  33. ZogaraUmmey Hassan said:

    Nov 08, 12 at 3:04 am

    Excellent. Help me a lot to find out correlation coefficients.

  34. MARIA said:

    Nov 12, 12 at 10:03 am

    X Y
    7 9
    8 11
    12 12
    4 13
    16 15
    18 17
    10 18
    FIND COEFFICIENT OF CORRELATION BY PEARSON METHOD PLEASE?

  35. Andale said:

    Nov 12, 12 at 1:29 pm

    Hi, Maria,

    Would you mind posting this in our forum? One of our mods would be happy to help :)

    http://www.statisticshowto.com/forums/

    Thanks,
    Stephanie

  36. MARIA said:

    Nov 13, 12 at 10:06 pm

    thanx Stephanie. To share something more ,can u mail at
    qaseem65@gmail.com

  37. MARIA said:

    Nov 13, 12 at 10:09 pm

    Hi kobby
    To me,teachers r right

  38. MARIA said:

    Nov 13, 12 at 10:15 pm

    Can someone help me tell about level of significance at 0.05/0.01 from Chai square table,with reference to hopothesis H0/HA ?? thx

  39. Ria said:

    Nov 15, 12 at 11:32 am

    Hi everyone, help please.
    I am very new to stats but need to grasp it quickly to analyze data in my thesis.

    I want to find out the nature of relationship between pop love songs themes and imagined interactions (Imagined Interaction Theory. The instrument to measure imagined interactions is a 7point interval scale ranging from strongly disagree to strongly agree.

    Can I use Pearson’s R to test the coefficient between my variables: love songs themes and imagined interactions?

    Thanks in advance for help

  40. Justin said:

    Nov 15, 12 at 1:12 pm

    can anyone help me do this stat homework?
    1) Watson & Watson Repair Inc. provides maintenance service for a large apartment complex in downtown
    Saint Petersburg, Florida. W & W managers are evaluating the possibility of hiring another maintenance
    person because it seems maintenance calls are increasing. Rafael Roddick and Andy Nadal are currently
    responsible for maintenance tasks. To investigate “what” drives Repair Time, the managers hire you as
    statistician to conduct a regression analysis. The table below provides data from a random selected sample of
    10 maintenance calls.
    a. (1pt)How would you include “responsible for maintenance” in your regression? (How would you define
    it?)
    STEP 1: use the dummy variable REPAIRPERSON = 1 IF responsible = RAFAEL
    REPAIRPERSON = 0 IF responsible = ANDY
    A regression model is set up using ONLY repairperson as variable to explain REPAIRTIME
    a. (1pt) Comment on the “correlation” between Repairperson and Repairtime.
    b. (1pt) Comment on goodness of fit of the model

    Maintenance Call | Repair Time (hours) | Months Since Last Service | Responsible for maintenance
    1                | 2.9                 | 3                         | Rafael Roddick
    2                | 3                   | 3.9                       | Rafael Roddick
    3                | 4.8                 | 8.2                       | Andy Nadal
    4                | 1.8                 | 3                         | Rafael Roddick
    5                | 2.9                 | 2                         | Rafael Roddick
    6                | 4.9                 | 7                         | Andy Nadal
    7                | 4.4                 | 9                         | Andy Nadal
    8                | 4.5                 | 8.5                       | Andy Nadal
    9                | 4.4                 | 4                         | Andy Nadal
    10               | 4.5                 | 6                         | Rafael Roddick
    Correlations (Pearson Correlation)
                 | Repairtime | Repairperson
    Repairtime   | 1.000      | -.783
    Repairperson | -.783      | 1.000
    Model Summary (b)
    Model | R        | R Square | Adjusted R Square | Std. Error of the Estimate
    1     | .783 (a) | .614     | .565              | .70071
    a. Predictors: (Constant), Repairperson
    b. Dependent Variable: Repairtime
    c. (2pt) Report the statistical significance of the coefficients.
    STEP 2: Use MONTHS SINCE LAST SERVICE AND REPAIRPERSON in a regression to explain REPAIRTIME
    a. (1pt) Comment on the scatter diagram for Months-since-last-service and Repairtime.
    b. (2pt) Comment on goodness of fit of the model. Do you find any difference with respect to the goodness of
    fit of the model in STEP 1?
    c. (1pt) Comment on the normality assumptions of the model.
    Coefficients (a)
    Model          | Unstandardized B | Std. Error | Standardized Beta | t      | Sig.
    1 (Constant)   | 4.600            | .313       |                   | 14.679 | .000
      Repairperson | -1.580           | .443       | -.783             | -3.565 | .007
    a. Dependent Variable: Repairtime
    Model Summary (b)
    Model | R        | R Square | Adjusted R Square | Std. Error of the Estimate
    1     | .839 (a) | .705     | .620              | .65498
    a. Predictors: (Constant), monthslastservice, Repairperson
    b. Dependent Variable: Repairtime
    d. (3pt) Report the statistical significance of the coefficients.

    e. (1pt) Why do you think the statistical significance of the coefficient for repairperson has changed from step 1
    to step 2?
    In Step 1, repairperson was the only variable explaining repair time. It seems that the
    combining this variable with months since last service the, repairperson loses explanatory
    power, which is reflected in the SS of the coefficient.
    f. (1pt) Write down the estimated regression equation.

    g. (2pt) Interpret the intercept for this model
    h. (2pt) Provide an interpretation for the slope coefficients of the model.
    Coefficients (a)
                      | Unstandardized B | Std. Error | Standardized Beta | t      | Sig.
    (Constant)        | 3.195            | 1.001      |                   | 3.192  | .015
    Repairperson      | -.860            | .642       | -.426             | -1.340 | .222
    monthslastservice | .191             | .130       | .467              | 1.468  | .18
    STEP 3: Use MONTHS SINCE LAST SERVICE to capture the curvature explaining REPAIRTIME
    1) (2pt) From all models bellow, which you think is best?
    Model Summary and Parameter Estimates
    Dependent Variable: Repairtime
    Equation  | R Square | F      | df1 | df2 | Sig. | Constant | b1     | b2    | b3
    Linear    | .629     | 13.558 | 1   | 8   | .006 | 2.036    | .325   |       |
    Quadratic | .709     | 8.531  | 2   | 7   | .013 | .213     | 1.130  | -.072 |
    Cubic     | .765     | 6.515  | 3   | 6   | .026 | 3.639    | -1.227 | .405  | -.029
    The independent variable is monthslastservice.
    The cubic model has a good fit as 76.5% so it represents a
    better fit for the model
    2) (10pt) Given the following estimated regression equation and SPSS output from regression, fill in the
    missing values. Show your calculations.

    ANOVA
    Model Sum of Squares df Mean Square F
    1 Regression
    Residual
    Total 25.5 7
    Coefficients
    Model
    Unstandardized Coefficients
    B Std. Error t
    1 (Constant) 83.23 1.574 52.882
    X1 0.304
    X2 1.301 0.321 4.057

  41. Gert said:

    Nov 19, 12 at 2:25 am

    Hi All,

    How does this calculation work when one of the datasets are percetages?

  42. AKHTAR RASOOL said:

    Nov 19, 12 at 4:49 am

    How to calculate pearson coefficienr for a line in the graph.

  43. Andale said:

    Nov 22, 12 at 5:09 am

    Hi, Justin,

    Please post your question on the forums. One of our mods will be able to help you (but please post one question at a time :) ).

    Regards,
    Stephanie

  44. anum said:

    Nov 23, 12 at 2:49 am

    i am unable to find the correct coefficient of correlation when it gives the negative value in the square root.

  45. Andale said:

    Nov 29, 12 at 1:47 pm

    Anum,

    Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!

    http://www.statisticshowto.com/forums/

    Stephanie

  46. Andale said:

    Nov 29, 12 at 1:48 pm

    Ahhtar,

    Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!

    http://www.statisticshowto.com/forums/

    Stephanie

  47. Andale said:

    Nov 29, 12 at 1:48 pm

    Gert,

    Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!

    http://www.statisticshowto.com/forums/

    Stephanie

  48. Andale said:

    Nov 29, 12 at 1:49 pm

    Hi, Ria,

    Unfortunately, time constraints prevent me from answering stats related questions on the comments section. But please ask for help on our forums — one of our moderators will be glad to help!

    http://www.statisticshowto.com/forums/

    Stephanie

  49. Rifat sheikh said:

    Nov 30, 12 at 6:35 am

    Thanks to help

  50. roy omondi said:

    Jan 07, 13 at 5:34 am

    helpfull an easy to understand

  51. lilian richard said:

    Jan 21, 13 at 3:53 pm

    can you assist me in choosing the test statistic tools in analyzing my hypotheses such as follows,

    1.there is a relationship between m-pesa and the economic and social outcomes in the society.

    2.there is a relationship between strategies and approaches used by m-pesa and customer satisfaction.

    3.there is relationship between transaction cost and the extent of use of m-pesa

  52. Andale said:

    Jan 22, 13 at 6:28 am

    Lilian,

    Time constraints prevent me from answering stats questions in the comments…but post on our forums and our mod will be happy to help :)

    Stephanie

  53. Co-efficient of variation and correlation co-efficient | learn4kicks said:

    Jan 24, 13 at 2:03 pm

    [...] closer to either -1 or +1 in the whole range of values between both -1 and +1. A worked example at Statistics HowTo  and [...]

  54. Derek said:

    Feb 14, 13 at 8:29 am

    Thank you so much for the step by step approach. Now if only I get get my college professors to explain things this way!

  55. Correlation made simple using R | My exploration into data analytics said:

    Feb 24, 13 at 7:41 am

    [...] We can use the cor(var1,var2) method to determine the correlation, which will default return the pearsons correlation co-efficient. Now will initially find the correlation between the scores of tamil subject and TotalScores. If you see the below picture we have used the function cor(studentsdata$Tamil, studentsdata$TotalScores) which is returning the value of 0.4370992 which is 43.70% which seems to low positive correlation. We have also tried to plot the data between both the variables using plot. If you wanna learn how to do calculation for correlation please refer to this link for a simple example. [...]

  56. What is the Correlation Coefficient Formula? | said:

    Mar 05, 13 at 7:08 am

    [...] Click here to find out more about Pearson’s correlation coefficient. Click here to find out how to calculate Pearson’s correlation coefficient in easy steps. [...]

  57. Sherry said:

    Mar 08, 13 at 1:01 pm

    When constructing the data table, do you use the percent or decimal? For example, x = the number of jobs in a particular state and y = the percent of poverty in that state. Would y = 15.2% or would y = 0.152 for the calculation? Thanks

  58. Andale said:

    Mar 11, 13 at 5:44 am

    Sherry,

    Use decimal. That makes multiplication possible. For example, if you were to multiply 10% by 10%, you would first have to convert them to decimals anyway (.1 * .1).

    Regards,
    Stephanie

  59. deno said:

    Apr 15, 13 at 12:10 pm

    Folks,

    We are in a grp project for our research class in medical informatics. We are to present the coorelation data in class , any suggestions on how to present this data graphically ? Any software any thing ? Can excel do it ?

    Thanks in advance for your help

    regards
    DenO
    f

  60. gaurang said:

    Apr 17, 13 at 5:38 am

    CAN I GET THE LIST OF ALL THE FORMULAS FOR CORRELATION & COEFFICIENTS

  61. grace said:

    Apr 20, 13 at 3:37 am

    Thank you so much for the step by step approach.The steps are wonderful adaptable.

  62. Andale said:

    Apr 23, 13 at 12:51 pm

    Hi, Joshua,
    Thank you for your question. Unfortunately, time constraints prevent me from answering math questions in the comments. Could you post your question on our forums? One of our mods would be glad to help.
    Stephanie

  63. Andale said:

    Apr 23, 13 at 12:51 pm

    Hi, Guarang,
    Thank you for your question. Unfortunately, time constraints prevent me from answering math questions in the comments. Could you post your question on our forums? One of our mods would be glad to help.
    Stephanie

  64. Pearson Correlation Coefficient (r) | Intro to Statistical Methods said:

    May 10, 13 at 9:04 am

    [...] This site provides an easy to follow example of how to find the correlation coefficient, although there are multiple ways to do so. If you are curious to see someone complete each step of finding the correlation coefficient, this site is for you!. This site also includes good general information pertaining to the correlation coefficient, and within that site I found this Scatterplot Demonstration which helps reinforce the idea of a strong  or weak correlation. Simply click on the various correlation coefficients on the side of the diagram and you can see what different correlations look like. Finally, this site covers a lot of the possible problems that could occur when using the correlation coefficient and is a good resource to know how to react when such problems occur. [...]