Regression analysis is used to find equations that fit data. Once we have the equation, we can use it as a statistical model to make predictions. One type of regression analysis is linear regression. When a correlation coefficient shows that one variable is likely to predict the other, and a scatter plot of the data appears to form a straight line, statisticians may use linear regression to find a predictive function. If you recall from elementary algebra, the equation for a line is y = mx + b. This article shows you how to take data, calculate linear regression, and find the equation y’ = a + bx, where a is the y-intercept and b is the slope. Note: If you’re taking AP Statistics, you may see the equation written as b0 + b1x, which is the same thing (you’re just using the variables b0 and b1 instead of a and b).
Step 1: Make a chart of your data, filling in the columns the same way you would if you were finding Pearson’s correlation coefficient.
|Subject|Age x|Glucose Level y|xy|x²|y²|
|---|---|---|---|---|---|
|1|43|99|4257|1849|9801|
From the above table, Σx = 247, Σy = 486, Σxy = 20485, Σx² = 11409, Σy² = 40022. n is the sample size (6, in our case).
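Step 2: Use the following equations to find a and b.
a = ((Σy)(Σx²) – (Σx)(Σxy)) / (n(Σx²) – (Σx)²)
b = (n(Σxy) – (Σx)(Σy)) / (n(Σx²) – (Σx)²)
Find a:
- ((486 × 11,409) – (247 × 20,485)) / (6(11,409) – 247²)
- 484,979 / 7,445
- = 65.1416
Find b:
- (6(20,485) – (247 × 486)) / (6(11,409) – 247²)
- (122,910 – 120,042) / (68,454 – 61,009)
- 2,868 / 7,445
- = .385225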
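If you would like to check the arithmetic, the calculation can be sketched in a few lines of Python. This is only an illustration: the function name fit_line and the variable names are made up for this example, and the sums are the ones listed after the table in Step 1.

```python
# Sums taken from the Step 1 table (n is the sample size).
n = 6
sum_x = 247
sum_y = 486
sum_xy = 20485
sum_x2 = 11409


def fit_line(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b) for the line y' = a + bx.

    fit_line is a hypothetical helper name used only for this sketch.
    """
    denom = n * sum_x2 - sum_x ** 2               # 68,454 - 61,009 = 7,445
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # intercept
    b = (n * sum_xy - sum_x * sum_y) / denom       # slope
    return a, b


a, b = fit_line(n, sum_x, sum_y, sum_xy, sum_x2)
print(f"a = {a:.4f}, b = {b:.6f}")  # a = 65.1416, b = 0.385225
```

Both formulas share the same denominator, so it is computed once and reused.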
Step 3: Insert the values into the equation.
y’ = a + bx
y’ = 65.14 + .385225x
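Once you have the equation, making a prediction is just a matter of plugging in x. The short Python sketch below assumes the rounded coefficients from Step 3; the function name predict_glucose is invented for this example. It uses the one data point shown in the table (age 43, glucose level 99) to compare a prediction with an observed value.

```python
def predict_glucose(age, a=65.14, b=0.385225):
    """Predicted glucose level y' = a + bx for a given age x.

    predict_glucose is a hypothetical name; a and b are the
    rounded values from Step 3.
    """
    return a + b * age


predicted = predict_glucose(43)
observed = 99                      # subject 1 in the Step 1 table
residual = observed - predicted    # how far the line misses this point

print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
```

For subject 1 the line predicts about 81.70, roughly 17 points below the observed 99, which gives a feel for how loosely this line fits the data.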
* Note that this example has a low correlation coefficient, so the fitted equation would not be a reliable predictor.
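To see how low the correlation is, you can compute Pearson’s correlation coefficient from the same sums used in this example. The Python sketch below is an illustration only; the variable names are assumptions made for this example, and the formula is the standard sums-based form of Pearson’s r.

```python
import math

# Sums from the Step 1 table, including the sum of y squared.
n = 6
sum_x, sum_y = 247, 486
sum_xy, sum_x2, sum_y2 = 20485, 11409, 40022

# r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(f"r = {r:.4f}, r squared = {r ** 2:.4f}")
```

Here r comes out to about 0.53, so r squared is about 0.28: age accounts for only around 28% of the variation in glucose level in this sample.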