## What is an Autoregressive Model?

An autoregressive (AR) model **predicts future behavior based on past behavior**. It’s used for forecasting when there is some correlation between values in a time series and the values that precede and succeed them. You *only* use past data to model the behavior, hence the name *auto*regressive (the Greek prefix *auto*- means “self”). The process is basically a linear regression of the data in the current series against one or more past values in the same series.
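The "regression against past values" idea can be sketched in a few lines of Python. This is a minimal illustration using ordinary least squares, not the only way to estimate an AR model; the function name `fit_ar_least_squares` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np

def fit_ar_least_squares(series, p):
    """Regress each value on its p previous values (ordinary least squares).

    Returns (intercept, lag_coefficients); lag_coefficients[k] belongs to
    the value k + 1 steps in the past.
    """
    y = np.asarray(series, dtype=float)
    n = len(y)
    if n <= p:
        raise ValueError("series must be longer than the number of lags")
    # Column k holds the series shifted back by k + 1 steps,
    # lined up with the targets y[p:].
    lags = np.column_stack([y[p - k - 1 : n - k - 1] for k in range(p)])
    # A leading column of ones lets the regression estimate an intercept.
    design = np.column_stack([np.ones(n - p), lags])
    coef, *_ = np.linalg.lstsq(design, y[p:], rcond=None)
    return coef[0], coef[1:]
```

For example, calling `fit_ar_least_squares(series, 1)` on a series where each value is exactly 1 plus half of the previous value should return an intercept close to 1 and a single lag coefficient close to 0.5.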

In an AR model, the value of the outcome variable (Y) at some point *t* in time is, like in “regular” linear regression, directly related to the predictor variable (X). Where simple linear regression and AR models differ is that Y is dependent on X **and previous values** for Y.

The AR process is an example of a stochastic process, which has a degree of uncertainty or randomness built in. The randomness means that you might be able to predict future trends pretty well with past data, but you’re never going to get 100 percent accuracy. Usually, the process gets “close enough” for it to be useful in most scenarios.

AR models are also called conditional models, Markov models, or transition models.

## AR(p) Models

An AR(p) model is an autoregressive model where specific lagged values of y_{t} are used as predictor variables. Lags are where results from one time period affect following periods.

**The value for “p” is called the order.** For example, an AR(1) would be a “first order autoregressive process.” The outcome variable in a first order AR process at some point in time *t* is related only to time periods that are one period apart (i.e. the value of the variable at t – 1). A second or third order AR process would be related to data up to two or three periods apart.
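To make the first order case concrete, here is a rough simulation sketch of an AR(1) process. The names `simulate_ar1` and `lag1_autocorrelation`, the constant term `delta`, the coefficient `phi`, and the choice of normally distributed noise are all assumptions made for this illustration.

```python
import numpy as np

def simulate_ar1(phi, delta, n, noise_sd=1.0, seed=0):
    """Generate n values where each one depends only on the value before it."""
    rng = np.random.default_rng(seed)
    y = np.empty(n)
    # Start at the long-run level delta / (1 - phi); this assumes |phi| < 1.
    y[0] = delta / (1.0 - phi)
    for t in range(1, n):
        # Current value = constant + phi * previous value + random noise.
        y[t] = delta + phi * y[t - 1] + rng.normal(0.0, noise_sd)
    return y

def lag1_autocorrelation(y):
    """Sample correlation between the series and itself one period earlier."""
    y = np.asarray(y, dtype=float)
    centered = y - y.mean()
    return float(np.dot(centered[1:], centered[:-1]) / np.dot(centered, centered))
```

With a long simulated series, `lag1_autocorrelation(simulate_ar1(0.7, 1.0, 5000))` should land near the chosen `phi` of 0.7, which is one way to see that neighbouring periods are related.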

**The AR(p) model is defined by the equation:**

y_{t} = δ + φ_{1}y_{t-1} + φ_{2}y_{t-2} + … + φ_{p}y_{t-p} + A_{t}

Where:

- y_{t-1}, y_{t-2}, …, y_{t-p} are the past series values (lags),
- A_{t} is white noise (i.e. randomness),
- and δ is defined by the following equation:
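A common way to write the constant δ, assuming the process is stationary with mean μ and the white noise has mean zero, comes from taking expectations on both sides of the AR(p) equation. The symbol μ is an assumption here; it is not defined in the text above.

```latex
\begin{aligned}
E[y_t] &= \delta + \phi_1 E[y_{t-1}] + \dots + \phi_p E[y_{t-p}] + E[A_t] \\
\mu &= \delta + (\phi_1 + \dots + \phi_p)\,\mu \\
\delta &= \Bigl(1 - \sum_{i=1}^{p} \phi_i\Bigr)\mu
\end{aligned}
```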
