Book Notes: Intro to Time Series and Forecasting -- Ch6 ARIMA Models


When data is not stationary

  • Symptom of non-stationarity: the sample ACF or sample PACF does not decrease rapidly to zero as the lag increases

  • What shall we do?
    • Differencing, then fit an ARMA (i.e., fit an ARIMA)
    • Transformation, then fit an ARMA
    • Seasonal models: SARIMA

A non-stationary example: Dow Jones utilities index data

library(itsmr); ## Load the ITSM-R package
par(mfrow = c(1, 3));
plot.ts(dowj, main = 'Raw data'); 
acf(dowj); pacf(dowj);

After differencing

par(mfrow = c(1, 3));
dowj_diff = diff(dowj);  ## lag-1 difference X_t - X_{t-1}
plot.ts(dowj_diff, main = 'Data after differencing'); 
acf(dowj_diff); pacf(dowj_diff);

ARIMA Models

ARIMA model: definition

  • Autoregressive integrated moving-average models (ARIMA): Let $d$ be a non-negative integer. Then the series $\{X_t\}$ is an ARIMA(p,d,q) process if $Y_t = (1-B)^d X_t$ is a causal ARMA(p,q) process.

  • Difference equation (DE) for an ARIMA(p,d,q) process: $$\phi^*(B)X_t = \phi(B)(1-B)^d X_t = \theta(B)Z_t, \qquad \{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$$
    • $\phi(z)$: polynomial of degree $p$, with $\phi(z) \neq 0$ for $|z| \le 1$
    • $\theta(z)$: polynomial of degree $q$
    • $\phi^*(z) = \phi(z)(1-z)^d$: has a zero of order $d$ at $z = 1$
  • An ARIMA process with $d > 0$ is NOT stationary! (See the simulation sketch below.)
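
For intuition, here is a minimal simulation sketch (not from the book) using base R's arima.sim, which simulates a causal ARMA series and integrates it d times:

## Simulate an ARIMA(1,1,0) process: Y_t = (1 - B)X_t is a causal AR(1)
set.seed(1);
x_sim = arima.sim(model = list(order = c(1, 1, 0), ar = 0.5), n = 200);
par(mfrow = c(1, 2));
plot.ts(x_sim, main = 'ARIMA(1,1,0): not stationary');
plot.ts(diff(x_sim), main = 'Differenced: stationary AR(1)');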

ARIMA mean is not determined by the DE

  • $\{X_t\}$ is an ARIMA(p,d,q) process. We can add an arbitrary polynomial trend of degree $d-1$: $$W_t = X_t + A_0 + A_1 t + \cdots + A_{d-1}t^{d-1}$$ with $A_0, \ldots, A_{d-1}$ being any random variables, and $\{W_t\}$ still satisfies the same ARIMA(p,d,q) difference equation

  • In other words, the ARIMA DE determines the second-order properties of $\{(1-B)^d X_t\}$ but not those of $\{X_t\}$

    • For parameter estimation: $\phi$, $\theta$, and $\sigma^2$ are estimated based on $\{(1-B)^d X_t\}$ rather than $\{X_t\}$
    • For forecasting, we need additional assumptions

Fit data using ARIMA processes

  • Whether to fit a finite time series using
    • non-stationary models (such as ARIMA), or
    • directly using stationary models (such as ARMA)?
  • If the fitted stationary ARMA model's AR polynomial $\phi(\cdot)$ has zeros very close to the unit circle, then fitting an ARIMA model is better
    • Parameter estimation is stable
    • The differenced series may only need a low-order ARMA
  • Limitation of ARIMA: it only permits data to be non-stationary in a very special way
    • A general non-stationary process: the AR polynomial can have zeros anywhere on the unit circle $|z| = 1$
    • An ARIMA model: $\phi^*(\cdot)$ only has a zero of multiplicity $d$ at the point $z = 1$
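
As a quick illustration on the Dow Jones data (a sketch, assuming stats::arima and the dowj series from itsmr are available), the AR coefficient of a stationary AR(1) fit tends to sit very close to the unit circle, which favors differencing first:

fit_arma  = arima(dowj, order = c(1, 0, 0));  ## stationary AR(1) on raw data
fit_arima = arima(dowj, order = c(1, 1, 0));  ## AR(1) after differencing
fit_arma$coef;   ## the ar1 estimate lies close to 1, i.e., near the unit circle
fit_arima$coef;  ## the differenced series only needs a small AR coefficient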

Transformation and Identification Techniques

Natural log transformation

  • When the variance of the data increases with the mean, it is common to apply a log transformation before fitting the data with an ARIMA or ARMA model.

  • When does the log transformation work well? Suppose that $$E(X_t) = \mu_t, \qquad \mathrm{Var}(X_t) = \sigma^2\mu_t^2.$$ Then by a first-order Taylor expansion of $\log(X_t)$ at $\mu_t$: $$\log(X_t) \approx \log(\mu_t) + \frac{X_t - \mu_t}{\mu_t} \;\Rightarrow\; \mathrm{Var}[\log(X_t)] \approx \frac{\mathrm{Var}(X_t)}{\mu_t^2} = \sigma^2.$$ The data after the log transformation, $\log(X_t)$, has a constant variance

  • Note: the log transformation can only be applied to positive data
  • Note: If $Y_t = \log(X_t)$, then because expectation and logarithm are not interchangeable, $\hat{X}_t \neq \exp(\hat{Y}_t)$
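
A small simulated sketch (hypothetical data, not from the book) of why the log transformation stabilizes variance when the standard deviation is proportional to the mean:

set.seed(2);
mu_t = exp(seq(1, 4, length.out = 200));   ## increasing mean level mu_t
x_t  = mu_t * (1 + 0.1 * rnorm(200));      ## sd(X_t) = 0.1 * mu_t
par(mfrow = c(1, 2));
plot.ts(x_t, main = 'Variance grows with mean');
plot.ts(log(x_t), main = 'After log: roughly constant variance');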

Generalize the log transformation: Box-Cox transformation

  • Box-Cox transformation: $$f_\lambda(x) = \begin{cases} \dfrac{x^\lambda - 1}{\lambda}, & x \ge 0,\ \lambda > 0, \\ \log(x), & x > 0,\ \lambda = 0. \end{cases}$$

    • Usual range: $0 \le \lambda \le 1.5$
    • Common values: $\lambda = 0, 0.5$
  • Note: $\lim_{\lambda \to 0} f_\lambda(x) = \log(x)$

  • Box-Cox transformation can only be applied to non-negative data
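
A direct R implementation of the definition above (box_cox is a hypothetical helper written for illustration):

box_cox = function(x, lambda) {
  ## Box-Cox transformation; requires x >= 0 (x > 0 when lambda = 0)
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda;
}
box_cox(2, 1e-8);  ## ~0.6931, approaching log(2) as lambda -> 0
box_cox(2, 0.5);   ## common choice lambda = 0.5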

Unit Root Test

Unit root test for AR(1) process

  • $\{X_t\}$ is an AR(1) process: $$X_t - \mu = \phi_1(X_{t-1} - \mu) + Z_t$$

  • Equivalent DE: $$\nabla X_t = X_t - X_{t-1} = \phi_0^* + \phi_1^* X_{t-1} + Z_t$$
    • where $\phi_0^* = \mu(1 - \phi_1)$ and $\phi_1^* = \phi_1 - 1$
    • Regressing $\nabla X_t$ onto $1$ and $X_{t-1}$, we get the OLS estimator $\hat{\phi}_1^*$ and its standard error $\mathrm{SE}(\hat{\phi}_1^*)$
  • Augmented Dickey-Fuller test for AR(1)
    • Hypotheses: $H_0: \phi_1 = 1$ vs. $H_1: \phi_1 < 1$
    • Equivalent hypotheses: $H_0: \phi_1^* = 0$ vs. $H_1: \phi_1^* < 0$
    • Test statistic (its limit distribution under $H_0$ is neither normal nor $t$): $$\hat{\tau} = \frac{\hat{\phi}_1^*}{\mathrm{SE}(\hat{\phi}_1^*)}$$
    • Rejection region: reject $H_0$ if $\hat{\tau} < -2.57$ (90%), $\hat{\tau} < -2.86$ (95%), or $\hat{\tau} < -3.43$ (99%); a by-hand sketch follows below
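
The AR(1) test is easy to reproduce by hand (a sketch using lm on the dowj series; the 't value' column of the OLS summary is exactly $\hat{\tau}$):

n     = length(dowj);
dx    = diff(dowj);                ## response: X_t - X_{t-1}
x_lag = dowj[-n];                  ## regressor: X_{t-1}
fit   = lm(dx ~ x_lag);            ## intercept estimates phi_0*, slope phi_1*
tau   = coef(summary(fit))['x_lag', 't value'];
tau;                               ## compare with -2.57, -2.86, -3.43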

Unit root test for AR(p) process

  • AR(p) process: $$X_t - \mu = \phi_1(X_{t-1} - \mu) + \cdots + \phi_p(X_{t-p} - \mu) + Z_t$$

  • Equivalent DE: $$\nabla X_t = \phi_0^* + \phi_1^* X_{t-1} + \phi_2^* \nabla X_{t-1} + \cdots + \phi_p^* \nabla X_{t-p+1} + Z_t$$

    • where $\phi_0^* = \mu(1 - \sum_{i=1}^p \phi_i)$, $\phi_1^* = \sum_{i=1}^p \phi_i - 1$, and $\phi_j^* = -\sum_{i=j}^p \phi_i$ for $j \ge 2$
    • Regressing $\nabla X_t$ onto $1, X_{t-1}, \nabla X_{t-1}, \ldots, \nabla X_{t-p+1}$, we get the OLS estimator $\hat{\phi}_1^*$ and its standard error $\mathrm{SE}(\hat{\phi}_1^*)$
  • Augmented Dickey-Fuller test for AR(p)
    • Hypotheses: $H_0: \phi_1^* = 0$ vs. $H_1: \phi_1^* < 0$
    • Test statistic: $\hat{\tau} = \hat{\phi}_1^* / \mathrm{SE}(\hat{\phi}_1^*)$
    • Rejection region: same as the augmented Dickey-Fuller test for AR(1)

Implement augmented Dickey-Fuller test in R

library(tseries);
## Note: the lag k here is the AR order p
adf.test(dowj, k = 2);
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dowj
## Dickey-Fuller = -1.3788, Lag order = 2, p-value = 0.8295
## alternative hypothesis: stationary

Forecast ARIMA models

Forecast an ARIMA(p,1,q) process

  • $\{X_t\}$ is ARIMA(p,1,q), and $\{Y_t = \nabla X_t\}$ is a causal ARMA(p,q): $$X_t = X_0 + \sum_{j=1}^{t} Y_j, \qquad t = 1, 2, \ldots$$

  • Best linear predictor of $X_{n+1}$: $$P_n X_{n+1} = P_n(X_0 + Y_1 + \cdots + Y_{n+1}) = P_n(X_n + Y_{n+1}) = X_n + P_n(Y_{n+1})$$
    • $P_n$ denotes prediction based on $\{1, X_0, X_1, \ldots, X_n\}$, or equivalently, $\{1, X_0, Y_1, \ldots, Y_n\}$
    • To find $P_n(Y_{n+1})$, we need to know $E(X_0^2)$ and $E(X_0 Y_j)$ for $j = 1, \ldots, n+1$
  • A sufficient assumption for $P_n(Y_{n+1})$ to be the best linear predictor in terms of $\{Y_1, \ldots, Y_n\}$: $X_0$ is uncorrelated with $Y_1, Y_2, \ldots$

Forecast an ARIMA(p,d,q) process

  • The observed ARIMA(p,d,q) process $\{X_t\}$ satisfies $$Y_t = (1-B)^d X_t, \quad t = 1, 2, \ldots, \qquad \{Y_t\} \sim \text{causal ARMA}(p,q)$$

  • Assumption: the random vector $(X_{1-d}, \ldots, X_0)$ is uncorrelated with $Y_t$ for all $t > 0$

  • One-step predictors $\hat{Y}_{n+1} = P_n Y_{n+1}$ and $\hat{X}_{n+1} = P_n X_{n+1}$ satisfy: $$X_{n+1} - \hat{X}_{n+1} = Y_{n+1} - \hat{Y}_{n+1}$$

  • Recall: the $h$-step predictor of ARMA(p,q) for $n > \max(p,q)$: $$P_n Y_{n+h} = \sum_{i=1}^{p} \phi_i P_n Y_{n+h-i} + \sum_{j=h}^{q} \theta_{n+h-1,j}\left(Y_{n+h-j} - \hat{Y}_{n+h-j}\right)$$

  • The $h$-step predictor of ARIMA(p,d,q) for $n > \max(p,q)$: $$P_n X_{n+h} = \sum_{i=1}^{p+d} \phi_i^* P_n X_{n+h-i} + \sum_{j=h}^{q} \theta_{n+h-1,j}\left(X_{n+h-j} - \hat{X}_{n+h-j}\right)$$ where $\phi^*(z) = (1-z)^d\phi(z) = 1 - \phi_1^* z - \cdots - \phi_{p+d}^* z^{p+d}$
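
In R, these $h$-step ARIMA predictors are available through predict on a fitted model (a sketch on the dowj series with an assumed ARIMA(1,1,0) specification):

fit = arima(dowj, order = c(1, 1, 0));
fc  = predict(fit, n.ahead = 5);   ## h-step predictors P_n X_{n+h}, h = 1..5
fc$pred;                           ## point forecasts
fc$pred + 1.96 * fc$se;            ## approximate 95% upper prediction bounds
fc$pred - 1.96 * fc$se;            ## approximate 95% lower prediction bounds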

Seasonal ARIMA Models

Seasonal ARIMA (SARIMA) Model: definition

  • Suppose $d, D$ are non-negative integers. $\{X_t\}$ is a seasonal ARIMA$(p,d,q) \times (P,D,Q)_s$ process with period $s$ if the differenced series $Y_t = (1-B)^d(1-B^s)^D X_t$ is a causal ARMA process defined by $$\phi(B)\Phi(B^s)Y_t = \theta(B)\Theta(B^s)Z_t, \qquad \{Z_t\} \sim \mathrm{WN}(0, \sigma^2),$$ where $$\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p, \qquad \Phi(z) = 1 - \Phi_1 z - \cdots - \Phi_P z^P,$$ $$\theta(z) = 1 + \theta_1 z + \cdots + \theta_q z^q, \qquad \Theta(z) = 1 + \Theta_1 z + \cdots + \Theta_Q z^Q.$$

  • $\{Y_t\}$ is causal if and only if neither $\phi(z)$ nor $\Phi(z)$ has zeros in the closed unit disk $|z| \le 1$

  • Usually, s=12 for monthly data
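
As a concrete special case (not worked out in the original notes), the widely used SARIMA$(0,1,1)\times(0,1,1)_{12}$ "airline" model has the difference equation $$(1-B)(1-B^{12})X_t = (1 + \theta_1 B)(1 + \Theta_1 B^{12})Z_t,$$ i.e., $Y_t = X_t - X_{t-1} - X_{t-12} + X_{t-13}$ is the moving average $Z_t + \theta_1 Z_{t-1} + \Theta_1 Z_{t-12} + \theta_1\Theta_1 Z_{t-13}$.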

Special case: seasonal ARMA (SARMA)

  • Between-year model: for monthly data, each one of the 12 month-wise time series is generated by the same ARMA(P,Q) model: $$\Phi(B^{12})Y_t = \Theta(B^{12})U_t, \qquad \{U_{j+12t},\, t \in \mathbb{Z}\} \sim \mathrm{WN}(0, \sigma_U^2)$$

  • SARMA(P,Q) with period $s$: in the above between-year model, the period 12 can be changed to any positive integer $s$
    • If $\{U_t, t \in \mathbb{Z}\} \sim \mathrm{WN}(0, \sigma_U^2)$, then the ACVF satisfies $\gamma(h) = 0$ unless $h$ is a multiple of $s$. But this may not be ideal for real-life applications! E.g., this Feb is correlated with last Feb, but not with this Jan.
  • General SARMA(p,q) × (P,Q) with period $s$: incorporate dependence between the 12 series by letting $\{U_t\}$ itself be an ARMA process: $$\Phi(B^s)Y_t = \Theta(B^s)U_t, \qquad \phi(B)U_t = \theta(B)Z_t, \qquad \{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$$
    • Equivalent DE for the general SARMA: $$\phi(B)\Phi(B^s)Y_t = \theta(B)\Theta(B^s)Z_t$$

Fit a SARIMA Model

  • Assume the period $s$ is known
  1. Find $d$ and $D$ that make the differenced series $\{Y_t\}$ look stationary

  2. Examine the sample ACF and sample PACF of $\{Y_t\}$ at lags that are multiples of $s$ to find the orders $P, Q$

  3. Use $\hat{\rho}(1), \ldots, \hat{\rho}(s-1)$ to find the orders $p, q$

  4. Use AICC to decide among competing order choices

  5. Given the orders $(p,d,q,P,D,Q)$, compute the MLE of the parameters $(\phi, \theta, \Phi, \Theta, \sigma^2)$; a fitting sketch follows below
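
Steps 4–5 can be carried out with stats::arima and its seasonal argument (a sketch on R's built-in monthly AirPassengers data, not a dataset from the book):

fit_sarima = arima(log(AirPassengers), order = c(0, 1, 1),
                   seasonal = list(order = c(0, 1, 1), period = 12));
fit_sarima$coef;  ## MLE of (theta_1, Theta_1)
fit_sarima$aic;   ## compare across competing order choices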

Regression with ARMA Errors

Regression with ARMA errors: OLS estimation

  • Linear model with ARMA errors $W = (W_1, \ldots, W_n)'$: $$Y_t = x_t'\beta + W_t, \quad t = 1, \ldots, n, \qquad \{W_t\} \sim \text{causal ARMA}(p,q)$$
    • Note: each row is indexed by a different time $t$!
    • Error covariance: $\Gamma_n = E(WW')$
  • Ordinary least squares (OLS) estimator: $$\hat{\beta}_{\mathrm{OLS}} = (X'X)^{-1}X'Y$$
    • Obtained by minimizing $(Y - X\beta)'(Y - X\beta)$
    • OLS is unbiased, even when the errors are dependent!

Regression with ARMA errors: GLS estimation

  • Generalized least squares (GLS) estimator: $$\hat{\beta}_{\mathrm{GLS}} = (X'\Gamma_n^{-1}X)^{-1}X'\Gamma_n^{-1}Y$$
    • Obtained by minimizing the weighted sum of squares $(Y - X\beta)'\Gamma_n^{-1}(Y - X\beta)$

    • Covariance: $\mathrm{Cov}(\hat{\beta}_{\mathrm{GLS}}) = (X'\Gamma_n^{-1}X)^{-1}$

    • GLS is the best linear unbiased estimator, i.e., for any vector $c$ and any unbiased estimator $\hat{\beta}$, $$\mathrm{Var}(c'\hat{\beta}_{\mathrm{GLS}}) \le \mathrm{Var}(c'\hat{\beta})$$

When {Wt} is an AR(p) process

  • We can apply $\phi(B)$ to each side of the regression equation to obtain uncorrelated, zero-mean, constant-variance errors: $$\phi(B)Y = \phi(B)X\beta + \phi(B)W = \phi(B)X\beta + Z$$

  • With the transformed target variable $$Y_t^* = \phi(B)Y_t, \quad t = p+1, \ldots, n,$$ and the transformed design matrix $$X_{t,j}^* = \phi(B)X_{t,j}, \quad t = p+1, \ldots, n, \quad j = 1, \ldots, k,$$ the OLS estimator is the best linear unbiased estimator

  • Note: after the transformation, the regression sample size reduces to $n - p$ (see the simulated sketch below)
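
A simulated sketch of this transformation for $p = 1$ (hypothetical data; $\phi_1$ is taken as known here, whereas in practice it must be estimated):

set.seed(3);
n    = 200;
x    = rnorm(n);                                  ## a single regressor
w    = arima.sim(model = list(ar = 0.6), n = n);  ## AR(1) errors
y    = 2 + 3 * x + w;                             ## true beta = (2, 3)
phi1 = 0.6;                                       ## assumed known AR(1) coefficient
y_star = y[-1] - phi1 * y[-n];                    ## Y*_t = phi(B) Y_t, t = 2..n
x_star = x[-1] - phi1 * x[-n];                    ## X*_t = phi(B) X_t
coef(lm(y_star ~ x_star));  ## slope ~ 3; intercept estimates 2 * (1 - phi1)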

Regression with ARMA errors: MLE

  • The MLE of $(\beta, \phi, \theta, \sigma^2)$ can be computed by maximizing the Gaussian likelihood with error covariance $\Gamma_n$

  • An iterative scheme

    1. Compute $\hat{\beta}_{\mathrm{OLS}}$ and the regression residuals $Y - X\hat{\beta}_{\mathrm{OLS}}$

    2. Based on the estimated residuals, compute the MLE of the ARMA(p,q) parameters

    3. Based on the fitted ARMA model, compute $\hat{\beta}_{\mathrm{GLS}}$

    4. Compute the regression residuals $Y - X\hat{\beta}_{\mathrm{GLS}}$, and return to Step 2; iterate until the estimators stabilize

  • Asymptotic properties of MLE: If {Wt} is a causal and invertible ARMA, then
    • MLEs are asymptotically normal
    • Estimated regression coefficients are asymptotically independent of estimated ARMA parameters
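
In practice, stats::arima implements this MLE directly through its xreg argument (a sketch on simulated data; the coefficient names shown are those returned by arima):

set.seed(4);
n = 300;
x = rnorm(n);                                   ## regressor
w = arima.sim(model = list(ar = 0.6), n = n);   ## causal AR(1) errors
y = 2 + 3 * x + w;                              ## true beta = (2, 3)
fit = arima(y, order = c(1, 0, 0), xreg = x);   ## joint MLE of beta and AR parameters
fit$coef;  ## ar1 ~ 0.6, intercept ~ 2, xreg coefficient ~ 3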
