Forward selection is a type of stepwise regression which begins with an empty model and adds in variables one by one. In each forward step, you add the one variable that gives the single best improvement to your model.

It is one of two commonly used methods of stepwise regression. The other, backward elimination, works in nearly the opposite direction: you start with a model that includes every candidate variable and remove the extraneous ones one by one.

## General Method Behind Forward Selection

Forward selection typically begins with only an intercept. Each candidate variable is then tested, and the 'best' one, where best is defined by a criterion chosen in advance, is added to the model.

As long as the model continues to improve (by that same criterion), the process continues, adding one variable at a time and re-testing at each step. Once adding another variable no longer improves the model, the process stops.
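The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses ordinary least squares, takes BIC as the criterion (one choice among many), and runs on simulated data. The names `bic` and `forward_select` are hypothetical helpers, not part of any library.

```python
import numpy as np


def bic(X, y, cols):
    """BIC of a least-squares fit of y on an intercept plus the columns in cols."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return n * np.log(rss / n) + design.shape[1] * np.log(n)


def forward_select(X, y, score=bic):
    """Greedy forward selection; a lower score means a better model."""
    selected = []
    remaining = list(range(X.shape[1]))
    best = score(X, y, selected)  # start from the intercept-only model
    while remaining:
        # Try adding each remaining variable and keep the best candidate.
        trial_score, j = min((score(X, y, selected + [j]), j) for j in remaining)
        if trial_score >= best:  # no candidate improves the model: stop
            break
        selected.append(j)
        remaining.remove(j)
        best = trial_score
    return selected, best


# Simulated data (an assumption for illustration): only columns 1 and 4 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=200)

selected, final_score = forward_select(X, y)
print("Variables in order of entry:", selected)
```

Because `score` is passed in as an argument, swapping BIC for another criterion only requires a different scoring function with the same signature.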

The criteria used to decide which variable enters at each step vary. You might look for the lowest cross-validated prediction error, the lowest p-value, or any of a number of other tests or measures of accuracy.
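As one example of such a criterion, the sketch below scores candidate variables by k-fold cross-validated mean squared error and picks the winner for a single forward step. The helper `cv_mse`, the choice of five folds, and the simulated data are assumptions made for illustration. The folds are contiguous blocks of rows, which is reasonable here only because the simulated rows are in random order.

```python
import numpy as np


def cv_mse(X, y, cols, k=5):
    """Mean squared prediction error over k folds for an intercept plus cols."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    errors = []
    for test_idx in np.array_split(np.arange(n), k):
        train = np.ones(n, dtype=bool)
        train[test_idx] = False  # hold this fold out of the fit
        coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        resid = y[test_idx] - design[test_idx] @ coef
        errors.append(np.mean(resid ** 2))
    return float(np.mean(errors))


# Simulated data (an assumption for illustration): only column 2 matters.
rng = np.random.default_rng(1)
X = rng.normal(size=(150, 4))
y = 2.0 * X[:, 2] + rng.normal(size=150)

# One forward step: compare each one-variable model with the intercept-only model.
baseline = cv_mse(X, y, [])
step_scores = {j: cv_mse(X, y, [j]) for j in range(X.shape[1])}
best_var = min(step_scores, key=step_scores.get)
print("Intercept-only CV error:", round(baseline, 3))
print("Best first variable:", best_var)
```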

Since stepwise regression tends toward overfitting, it is usually wise to set a strict criterion for adding any variable. (Overfitting happens when we include more variables than is actually good for the model. An overfit model typically shows a very close, neat fit to the data used in the regression, but it will be far off on additional data points and is a poor basis for interpolation.)
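The effect of overfitting can be seen in a small simulation. The sketch below is not forward selection itself: it simply fits least-squares models on the first 1, 10, and 30 columns of simulated data in which only the first column carries signal, then compares the error on the fitting data with the error on fresh data. The sample sizes and the helper name `fit_and_score` are assumptions chosen for illustration.

```python
import numpy as np


def fit_and_score(X_train, y_train, X_test, y_test, n_vars):
    """Fit OLS on the first n_vars columns; return (train MSE, test MSE)."""
    def design(X):
        return np.column_stack([np.ones(len(X)), X[:, :n_vars]])

    coef, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)
    train_mse = float(np.mean((y_train - design(X_train) @ coef) ** 2))
    test_mse = float(np.mean((y_test - design(X_test) @ coef) ** 2))
    return train_mse, test_mse


rng = np.random.default_rng(2)
X_train = rng.normal(size=(40, 30))
X_test = rng.normal(size=(1000, 30))
# Only column 0 carries signal; the other 29 columns are pure noise.
y_train = 2.0 * X_train[:, 0] + rng.normal(size=40)
y_test = 2.0 * X_test[:, 0] + rng.normal(size=1000)

results = {m: fit_and_score(X_train, y_train, X_test, y_test, m) for m in (1, 10, 30)}
for m, (train_mse, test_mse) in results.items():
    print(f"{m:2d} variables: train MSE {train_mse:.2f}, test MSE {test_mse:.2f}")
```

The training error can only go down as columns are added, because each model contains the previous one. The error on new data is what shows whether the extra variables helped.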

## References

Shalizi, Cosma. Lecture 26: Variable Selection. Modern Regression for Undergraduates Class Notes. http://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/26/lecture-26.pdf

Brant, Rollin. Forward Selection. MDSC 643.02 Lecture Materials. Retrieved from https://www.stat.ubc.ca/~rollin/teach/643w04/lec/node41.html on July 7, 2018.

SAS Support. Forward Selection. The GLMSELECT Procedure. Retrieved from http://support.sas.com/documentation/cdl/en/statug/66859/HTML/default/viewer.htm#statug_glmselect_details03.htm on July 8, 2018.

Cook, Perry. Stepwise Selection. Human-Computer Interface Technology (CS436) Class Notes. Retrieved from https://www.cs.princeton.edu/courses/archive/fall08/cos436/Duda/FS/stepwise.htm on July 8, 2018.
