Time Series SARIMA

Why differencing works

  • For time series X_t = \beta_0+\beta_1t+W_t :
    • E[X_t] = \beta_0 + \beta_1t + E[W_t] = \beta_0 + \beta_1t, which depends on t
    • So X is not stationary
  • Let Y_t = X_t-X_{t-1} = \beta_1+W_t-W_{t-1}, which is an MA(1) model
    • So Y is stationary
  • By differencing, a nonstationary series can be transformed into a stationary one
  • Formal notation:
    • backshift operator
    • Bx_t := x_{t-1}, B^kx_t = x_{t-k}
    • difference operator of order d: (1-B)^d
      • d=1: Y_t = X_t-X_{t-1} = (1-B)X_t
      • d=2: Y_t = (1-B)^2X_t = X_t-2X_{t-1}+X_{t-2}

ARIMA model

  • steps:
    • Assume X_t is not stationary; by using Y_t = (1-B)^dX_t, it becomes stationary
    • Fit an ARMA(p, q) on Y_t for forecasting, which is equivalent to fitting an ARIMA(p, d, q) model to X_t
    • Use forecasts of Y_t with a linear transformation to predict X_t values
  • example:
    • Suppose Y_t=(1-B)X_t
    • Forecast Y_t by using an ARMA model \hat Y_{n+1}, \hat Y_{n+2}, ...
    • Predict X_t as: \hat X_{n+1} = X_n + \hat Y_{n+1}, \hat X_{n+2} = \hat X_{n+1} + \hat Y_{n+2}, ...
  • Ljung-Box test:
    • H0: Model does not exhibit lack of fit
    • Ha: Model exhibits lack of fit
    • The idea is to look at the autocorrelation of the residuals (actual - predicted); if they exhibit correlation, then the model has not fully captured the underlying structure of the time series data
  • Forecasting accuracy metrics:
    • MAE = \frac {1}{m} \sum^m_{t=1}|x_{n+t} - \hat x_{n+t}|
    • MAPE = \frac {1}{m} \sum^m_{t=1} \left| \frac{x_{n+t} - \hat x_{n+t}}{x_{n+t}} \right| \times 100\%
    • RMSE = \sqrt{\frac {1}{m} \sum^m_{t=1} (x_{n+t} - \hat x_{n+t})^2}

Box-Cox transformation

  • Use this transformation to transform a non-normal time series into a near-normal one
  • how
    • y = \large \frac {x^\lambda - 1}{\lambda} if \lambda ≠0
    • y = log(x) if \lambda = 0
    • \lambda is estimated by finding the value that maximizes the log-likelihood of the transformed data
  • This is used to reduce heteroskedasticity
  • steps:
    • A heteroskedastic time series X can be transformed into a homoskedastic series Y by choosing a value of \lambda
    • Fit an ARIMA model on Y and obtain \hat Y
    • Reverse the transform to get \hat X
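The Box-Cox round trip can be sketched with SciPy's `boxcox` and `inv_boxcox`; the heteroskedastic series below is simulated for illustration:

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(2)
# A positive series whose variance grows with its level (heteroskedastic)
x = np.exp(0.02 * np.arange(300) + 0.3 * rng.normal(size=300))

y, lam = boxcox(x)            # lambda chosen by maximizing the log-likelihood
x_back = inv_boxcox(y, lam)   # reversing the transform recovers the series
print(lam)
```

In the modeling workflow, the ARIMA fit and forecasting happen on `y`, and `inv_boxcox` maps the forecasts back to the original scale.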

SARIMA model

  • SARIMA(p,d,q)(P,D,Q)s
    • seasonal AR(P):
      • ACF tails off at lags S, 2S, …
      • PACF cuts off after lag PS
    • seasonal MA(Q):
      • ACF cuts off after lag QS
      • PACF tails off at lags S, 2S, …
    • seasonal ARMA(P, Q):
      • ACF tails off at lags S, 2S, …
      • PACF tails off at lags S, 2S, …
    • Then p, q are estimated within the lag interval between 1 and S

Hybrid SARIMA-regression model

  • Combine SARIMA with external explanatory variables by regressing on them; in some cases this substantially improves forecast accuracy
  • For time series X_t = \beta_0+\beta_1t+\beta_2W_{t-1}+W_t:
    • E[X_t] = \beta_0 + \beta_1t + \beta_2E[W_{t-1}] + E[W_t] = \beta_0 + \beta_1t
    • This value is dependent on t, so X_t is not stationary
  • Y_t = X_t-X_{t-1} = (\beta_0+\beta_1t+\beta_2W_{t-1}+W_t) - (\beta_0+\beta_1(t-1)+\beta_2W_{t-2}+W_{t-1}) = \beta_1+W_t+(\beta_2-1)W_{t-1}-\beta_2W_{t-2}
  • By definition, Y_t is an MA(2) model. MA models are always stationary.
