The ARIMA (Autoregressive Integrated Moving Average) model is a popular time series forecasting model that combines three main components: Autoregressive (AR), Integrated (I), and Moving Average (MA). Here's a breakdown of each component:
Autoregressive (AR) Component:
- The AR component captures the relationship between an observation and a certain number of lagged observations.
- It assumes that the current value of the time series is a linear function of its own previous values plus a random error term.
- The order of the AR component, denoted as "p", represents the number of lag observations included in the model.
- For example, an AR(1) model means that the current value is dependent on the immediately preceding value, while an AR(2) model depends on the last two values, and so on.
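
To make the AR idea concrete, here is a minimal sketch that simulates an AR(1) series and recovers its coefficient by least squares. The function names (`simulate_ar1`, `estimate_ar1`), the coefficient 0.7, and the standard normal noise are all illustrative assumptions, not part of any library.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal noise e_t."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def estimate_ar1(y):
    """Least-squares estimate of phi: regress y_t on y_{t-1} (no intercept)."""
    num = sum(prev * cur for prev, cur in zip(y[:-1], y[1:]))
    den = sum(prev * prev for prev in y[:-1])
    return num / den

series = simulate_ar1(phi=0.7, n=5000, seed=42)
phi_hat = estimate_ar1(series)
print(f"true phi = 0.7, estimated phi = {phi_hat:.3f}")
```

With a few thousand points the estimate should land close to the true coefficient; a real fitting routine would typically also estimate an intercept and the noise variance.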
Integrated (I) Component:
- The Integrated component deals with the differencing of the time series.
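- Differencing replaces each value with its change from the previous value, which removes a trend in the level of the series and helps make the series stationary (roughly, its mean and variance no longer change over time).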
- The order of differencing, denoted as "d", represents the number of times the time series needs to be differenced to achieve stationarity.
- For example, an ARIMA model with d=1 means that the first-order difference of the time series is used, while d=2 indicates that the second-order difference (the difference of the first differences) is used.
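
The mechanics of differencing can be sketched in a few lines. The helper `difference` and the two toy series are hypothetical; they use deterministic trends only because the effect is easy to see, whereas real data would also contain noise.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y'_t = y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out[:-1], out[1:])]
    return out

linear = [3 + 2 * t for t in range(8)]   # linear trend
quadratic = [t * t for t in range(8)]    # quadratic trend

print(difference(linear, d=1))      # constant series of 2s
print(difference(quadratic, d=1))   # still trending: 1, 3, 5, ...
print(difference(quadratic, d=2))   # constant series of 2s
```

Note that each round of differencing shortens the series by one observation.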
Moving Average (MA) Component:
- The MA component captures the relationship between an observation and the random error terms (one-step forecast errors) from previous time points.
- It models the current value of the time series as a linear combination of the current error term and error terms from previous time points.
- The order of the MA component, denoted as "q", represents the number of lagged forecast errors in the model.
- For example, an MA(1) model means that the current value is dependent on the immediately preceding forecast error, while an MA(2) model depends on the last two forecast errors, and so on.
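
As a rough illustration of the MA idea, the sketch below simulates an MA(1) series and checks that its autocorrelation is non-zero at lag 1 and close to zero at lag 2. The names `simulate_ma1` and `autocorr`, the value theta = 0.6, and the standard normal shocks are assumptions made for the example.

```python
import random

def simulate_ma1(theta, n, seed=0):
    """Simulate y_t = e_t + theta * e_{t-1} with standard normal shocks e_t."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [e[t] + theta * e[t - 1] for t in range(1, n + 1)]

def autocorr(y, lag):
    """Sample autocorrelation of y at the given lag."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y)
    cov = sum((y[t] - mean) * (y[t - lag] - mean) for t in range(lag, n))
    return cov / var

theta = 0.6
y = simulate_ma1(theta, n=5000, seed=1)
rho1_theory = theta / (1 + theta ** 2)   # lag-1 autocorrelation of an MA(1) process
print(f"lag 1: sample {autocorr(y, 1):.3f}, theory {rho1_theory:.3f}")
print(f"lag 2: sample {autocorr(y, 2):.3f}, theory 0.000")
```

Because an MA(q) process only "remembers" the last q shocks, its autocorrelation vanishes beyond lag q, which is what the lag 2 line is meant to show.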
The ARIMA model is specified as ARIMA(p, d, q), where:
- p is the order of the AR component
- d is the degree of differencing
- q is the order of the MA component
For example, an ARIMA(1, 1, 2) model includes an AR component of order 1, first-order differencing, and an MA component of order 2.
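
Written out as equations, the specification looks roughly like the following. The symbols are assumptions introduced here: B is the backshift operator (B y_t = y_{t-1}), c is a constant, the phi and theta values are the AR and MA coefficients, and epsilon_t is white noise. This is one common sign convention; some texts and software flip the sign of the theta terms.

```latex
% General ARIMA(p, d, q) in backshift notation
\left(1 - \sum_{i=1}^{p} \phi_i B^i\right) (1 - B)^d \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right) \varepsilon_t

% ARIMA(1, 1, 2), written in terms of the first difference w_t = y_t - y_{t-1}
w_t = c + \phi_1 w_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}
```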
The ARIMA model assumes that the time series is stationary or can be made stationary through differencing. If the time series exhibits seasonal patterns, the Seasonal ARIMA (SARIMA) model can be used, which adds seasonal AR, differencing, and MA terms at the seasonal lag.
To determine the appropriate orders (p, d, q) for an ARIMA model, various techniques can be employed, such as:
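- Plotting the autocorrelation function (ACF) and partial autocorrelation function (PACF) to identify significant lags. For a pure AR process the PACF cuts off after lag p, and for a pure MA process the ACF cuts off after lag q.
- Using information criteria like the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) to select the best model order; lower values indicate a better balance between fit and model complexity.
- Performing a grid search or using automated model selection techniques to find the optimal combination of p, d, and q.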
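
The ACF/PACF approach can be sketched without any plotting library. The example below computes sample partial autocorrelations with the Durbin-Levinson recursion and flags lags outside an approximate 95% band. All function names, the AR(2) coefficients (0.5, 0.3), and the band 1.96/sqrt(n) are assumptions for illustration; in practice a statistics package would normally do this for you.

```python
import random

def acf(y, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
            for k in range(max_lag + 1)]

def pacf(y, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(y, max_lag)
    result = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1} from the previous step
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        result.append(phi_kk)
        prev = cur
    return result

def simulate_ar2(phi1, phi2, n, seed=0):
    """Simulate y_t = phi1 * y_{t-1} + phi2 * y_{t-2} + e_t."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(n):
        y.append(phi1 * y[-1] + phi2 * y[-2] + rng.gauss(0.0, 1.0))
    return y[2:]

y = simulate_ar2(0.5, 0.3, n=5000, seed=7)
bound = 1.96 / len(y) ** 0.5   # approximate 95% band under white noise
partial = pacf(y, max_lag=5)
for k in range(1, 6):
    flag = "significant" if abs(partial[k]) > bound else ""
    print(f"lag {k}: PACF = {partial[k]: .3f} {flag}")
```

For this simulated AR(2) series the PACF should be clearly non-zero at lags 1 and 2 and small afterwards, which is the pattern that would suggest p = 2.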
Once the ARIMA model is specified and fitted to the time series data, it can be used to make forecasts for future time points based on the patterns and relationships captured by the AR, I, and MA components.
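
To show how the pieces fit together at forecast time, here is a minimal sketch of an ARIMA(1, 1, 0) forecast built by hand: difference the series, fit an AR(1) with intercept to the differences by least squares, forecast the differences recursively, and add them back to the last observed level. The function names and the simulated `history` series are hypothetical, and least squares stands in for the maximum-likelihood fitting that statistical packages usually perform.

```python
import random

def fit_ar1_with_intercept(w):
    """Least-squares fit of w_t = c + phi * w_{t-1} + e_t; returns (c, phi)."""
    x, z = w[:-1], w[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    phi = sxz / sxx
    return mz - phi * mx, phi

def forecast_arima_110(y, steps):
    """Forecast y by fitting AR(1) to its first difference and integrating back."""
    w = [b - a for a, b in zip(y[:-1], y[1:])]   # d = 1
    c, phi = fit_ar1_with_intercept(w)
    last_level, last_diff = y[-1], w[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff          # AR(1) forecast of the next difference
        last_level = last_level + last_diff      # undo the differencing
        forecasts.append(last_level)
    return forecasts

# Simulated example data: a level whose increments follow an AR(1) process.
rng = random.Random(3)
diff, level, history = 0.0, 100.0, []
for _ in range(300):
    diff = 0.2 + 0.6 * diff + rng.gauss(0.0, 1.0)
    level += diff
    history.append(level)

print([round(v, 2) for v in forecast_arima_110(history, steps=5)])
```

The same pattern extends to other orders: forecast on the differenced scale using the AR and MA terms, then cumulatively sum the forecasts d times to return to the original scale.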