
If you’re just beginning to learn about regression analysis, simple linear regression is the first type of regression you’ll come across in a stats class.

Linear regression is one of the most **widely used statistical techniques**; it is a way to model the relationship between two variables. The result is a linear regression equation that can be used to make predictions about the data. The equation produced is in the form **y=ax+b**, which is the **slope-intercept form** of a line, where a is the slope of the line and b is the y-intercept.
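As a rough illustration, here is a minimal Python sketch of how such an equation can be found and then used for a prediction. It uses the standard least-squares formulas; the function name `fit_line` and the hours/scores data are made up for this example and are not from any particular package.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

slope, intercept = fit_line(hours, scores)

# Use the fitted equation to predict the score for 6 hours of study.
predicted = slope * 6 + intercept

print(f"score = {slope:.2f} * hours + {intercept:.2f}")
print(f"predicted score at 6 hours: {predicted:.1f}")
```

The prediction step is just plugging a new x value into the fitted equation, which is all "making predictions about the data" means here.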

Most software packages and calculators can calculate linear regression.

You can also find a linear regression by hand.
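For reference, the by-hand calculation usually comes down to the two standard least-squares formulas below. This is a sketch of the common textbook form, where n is the number of data points and each sum runs over all of the (x, y) pairs:

$$
\text{slope} = \frac{n\sum xy - \sum x \sum y}{n\sum x^{2} - \left(\sum x\right)^{2}},
\qquad
\text{intercept} = \frac{\sum y - \text{slope}\cdot\sum x}{n}
$$

In practice you would build a table with columns for x, y, xy and x², total each column, and plug the totals into the formulas.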

Before you try your calculations, you should always make a scatter plot to see if your data roughly fits a line. **Why?** Because regression will *always* give you an equation, and that equation may not make any sense if your data follows a curve (an exponential one, for example) rather than a line.
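To see why, here is a small sketch that fits a line to data that is exponential by construction. The data, the `fit_line` helper and the `r_squared` helper are all made up for illustration; R² (the share of variation in y explained by the line) stands in for the eyeball check you would get from a scatter plot.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Return the fraction of the variation in ys explained by the line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data that doubles at every step: y = 2^x.
xs = list(range(10))
ys = [2 ** x for x in xs]

# The fit still "works" and returns an equation...
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)

# ...but the line predicts a negative value at x = 0, where the true y is 1.
prediction_at_zero = slope * 0 + intercept

print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r2:.2f}")
print(f"predicted y at x=0: {prediction_at_zero:.2f} (actual: {ys[0]})")
```

The calculation never complains; it is up to you to notice that a straight line is the wrong shape for this data.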

### Etymology

“Linear” means line. The word *regression* came from a 19th-century scientist, Sir Francis Galton, who coined the phrase “regression toward mediocrity” (in modern language, that’s regression toward the mean). He used the term to describe how nature tends to dampen extreme physical traits (like extreme height) from generation to generation.

### Why use Linear Relationships?

Linear relationships, i.e. lines, are easier to work with, and many phenomena are at least approximately linearly related. If variables *aren’t* linearly related, then a mathematical transformation (taking logarithms, for example) can often turn that relationship into a linear one, so that it’s easier for the researcher (i.e. you) to understand.
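As one example of such a transformation, the sketch below takes the natural log of exponentially growing y values, which turns the curve into a straight line that ordinary linear regression can handle. The data and the `fit_line` helper are hypothetical, chosen so the true relationship (y = 3 · 2^x) is known in advance.

```python
import math


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical exponential data: y = 3 * 2^x.
xs = list(range(10))
ys = [3 * 2 ** x for x in xs]

# Taking logs gives ln(y) = ln(3) + x * ln(2), which is a straight line in x.
log_ys = [math.log(y) for y in ys]
slope, intercept = fit_line(xs, log_ys)

# Undo the log to recover the growth factor and the starting value.
growth_factor = math.exp(slope)
start_value = math.exp(intercept)

print(f"growth factor per step: {growth_factor:.3f}")
print(f"starting value: {start_value:.3f}")
```

The regression is run on the transformed values, so the slope and intercept have to be converted back (here with `math.exp`) before they can be read in the original units.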

## What is Simple Linear Regression?

You’re probably familiar with plotting line graphs with one X axis and one Y axis. The X variable is sometimes called the independent variable and the Y variable is called the dependent variable. Simple linear regression plots one independent variable X against one dependent variable Y. Technically, in regression analysis, the independent variable is usually called the predictor variable and the dependent variable is called the criterion variable. However, many people just call them the independent and dependent variables. More advanced regression techniques (like multiple regression) use multiple independent variables.

Regression analysis can result in *linear* or *nonlinear* graphs. A linear regression is where the relationships between your variables can be described with a straight line. Nonlinear regressions produce curved lines.**

Regression analysis is almost always performed by a computer program, as the calculations are extremely time-consuming to work through by hand. The following video explains how to find a simple linear regression equation by hand:

Simple linear regression in Microsoft Excel is a **lot** easier, as Excel performs all of those tedious calculations for you. This video walks you through the steps of finding a simple linear regression equation in Microsoft Excel:

If you’d prefer to read a text version of the videos, hop over to the next article:

How to Find a Linear Regression Equation

**As this is an introductory article, I kept it simple. But there’s actually an important technical difference between linear and nonlinear regression that will become more important if you continue studying regression. For details, see the article on nonlinear regression.

**Questions**? Post a comment and I’ll do my best to help!


Is linear regression the test to use for continuous variables?

Yes, you can use it for continuous variables :)