Statistics How To

Line of Best Fit: What it is, How to Find it

What is the Line of Best Fit?

The line of best fit (or trendline) is an educated guess about where a linear equation might fall in a set of data plotted on a scatter plot: it is the straight line that comes closest to the points overall. Trendlines are usually plotted with software because, once you’ve got more than a few points on a piece of paper, it can be difficult to judge by eye where the line of best fit should go.

This handy applet from Illinois State University is free and allows you to plot a series of points (up to 10) and find the line of best fit. This first graph I made was for the points:
(1,2)
(2,3)
(3,4)
(4,5)
(5,6)
[Figure: line of best fit]
Not surprisingly, the line of best fit passes straight through all five dots: the points are perfectly collinear, since each y-value is exactly one more than its x-value.

Look what happens when one of the points is moved down:
[Figure: line of best fit after one point is moved down]
The line of best fit drops slightly lower. That’s because the dropped point acts like gravity, pulling the best fit line downward.

Caution!

Just because you get a line of best fit doesn’t mean that it makes sense. Take this set of unrelated (scattered) data points. If you look at the points by themselves, there clearly isn’t any kind of trend, but the software will give you a guesstimate anyway.
[Figure: line of best fit through scattered points]
You should always plot your data on a scatter plot before you get your line of best fit, and eyeball the graph to see if a linear equation makes sense for your data. It’s possible to find non-linear curves of best fit (like polynomial curves of best fit), but if you’ve got completely random data, any line or curve of best fit is going to be a pretty awful guesstimate.

Equation for the Line of Best Fit

Our online linear regression calculator will give you an equation to go with your data. For example, the first graph above gives the equation y = 1 + 1x (that is, y = x + 1). If you graph this equation on a graphing calculator, you’ll see that the line matches perfectly with the line in the first image above. You can find a linear regression by hand, but I wouldn’t recommend it, as the process is very tedious and it’s easy for errors to slip in.
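
To show where an equation like that comes from, here is a minimal sketch of the standard least squares formulas applied to the five points from the first graph. The variable names are my own; the code just automates the sums you would otherwise work out by hand.

```python
# The five points from the first graph.
points = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

n = len(points)
sum_x = sum(x for x, _ in points)        # 15
sum_y = sum(y for _, y in points)        # 20
sum_xy = sum(x * y for x, y in points)   # 70
sum_x2 = sum(x * x for x, _ in points)   # 55

# Least squares slope and intercept for y = intercept + slope * x.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

print(f"y = {intercept} + {slope}x")  # prints "y = 1.0 + 1.0x"
```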

A line of best fit is usually found through simple linear regression. Most spreadsheet and statistical software programs can perform linear regression (and most other types of regression analysis).

Types of Trendline

Linear Trendline

This is a good choice when a set of data points appears to follow a straight line. The trendline is the line of best fit: a straight line that’s a good approximation of the data.

[Figure: linear trendline. Stock prices showing an upward movement.]

Polynomial Trendline

A polynomial trendline has one or more curves and bumps. Real-world data often follows a curved pattern like this rather than a perfectly straight line.
[Figure: polynomial trendline]

The image above shows a polynomial with one curve (a parabola); this is called a second-degree polynomial. Data points with a series of bumps and curves can be fitted to third-degree and higher polynomials.
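
As a rough sketch of how a polynomial trendline can be computed, the code below fits a polynomial by least squares using only the Python standard library. The helper name `polyfit` and the sample points are hypothetical (the points were chosen to lie exactly on the parabola y = 5 - 4x + x^2); spreadsheet and statistics programs may use different numerical methods.

```python
def polyfit(xs, ys, degree):
    """Least squares polynomial coefficients [c0, c1, ..., c_degree]."""
    m = degree + 1
    # Normal equations A c = b: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


# Hypothetical points lying on the parabola y = 5 - 4x + x^2.
xs = [0, 1, 2, 3, 4]
ys = [5, 2, 1, 2, 5]

line = polyfit(xs, ys, 1)      # straight line: [intercept, slope]
parabola = polyfit(xs, ys, 2)  # second-degree: [c0, c1, c2]

print("linear fit:   ", [round(c, 3) for c in line])
print("quadratic fit:", [round(c, 3) for c in parabola])
```

For these points the straight line comes out flat (y = 3), missing the curve entirely, while the second-degree fit recovers the coefficients 5, -4 and 1.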

Exponential Trendline

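An exponential trendline can show exponential growth or exponential decay. It’s useful when data points rise or fall at a rate proportional to their current value, so that the change gets faster and faster (growth) or levels off toward zero (decay).
[Figure: exponential trendline]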

Calculating a Trendline

While a linear trendline is computed by minimizing the sum of the squared vertical distances from the line to the points (a method called “least squares” fitting), there are multiple ways to create an exponential trendline. Trendlines aren’t usually calculated by hand, as the process is tedious and lengthy. Nearly all spreadsheet and statistical programs have an option for trendlines.

Adding a Trendline in Excel

Excel can add a variety of trendlines to your data points, including exponential, linear, logarithmic, and polynomial. For instructions on how to create a scatter plot of your data and add a trendline (includes video), see: How to Make a Scatter Plot in Excel.

Line of Best Fit: What it is, How to Find it was last modified: November 13th, 2017 by Stephanie Glen