When data is not stationary
Implication of non-stationarity: the sample ACF or sample PACF does not decrease rapidly to zero as the lag increases
- What shall we do?
- Differencing, then fit an ARMA (i.e., an ARIMA model)
- Transformation, then fit an ARMA
- Seasonal models (SARIMA)
A non-stationary example: Dow Jones utilities index data
library(itsmr); ## Load the ITSM-R package
par(mfrow = c(1, 3));
plot.ts(dowj, main = 'Raw data');
acf(dowj); pacf(dowj);
After differencing
par(mfrow = c(1, 3));
dowj_diff = diff(dowj); ## Lag-1 differencing: X_t - X_{t-1}
plot.ts(dowj_diff, main = 'Data after differencing');
acf(dowj_diff); pacf(dowj_diff);
ARIMA Models
ARIMA model: definition
Autoregressive integrated moving-average (ARIMA) models: Let $d$ be a non-negative integer. Then $\{X_t\}$ is an ARIMA$(p, d, q)$ process if $Y_t := (1 - B)^d X_t$ is a causal ARMA$(p, q)$ process.
- Difference equation (DE) for an ARIMA$(p, d, q)$ process: $\phi^*(B) X_t \equiv \phi(B)(1 - B)^d X_t = \theta(B) Z_t$, $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$
- $\phi(z)$: polynomial of degree $p$, with $\phi(z) \neq 0$ for $|z| \le 1$
- $\theta(z)$: polynomial of degree $q$
- $\phi^*(z) = \phi(z)(1 - z)^d$: has a zero of order $d$ at $z = 1$
An ARIMA process with $d \ge 1$ is NOT stationary!
The ARIMA mean is not determined by the DE
Suppose $\{X_t\}$ is an ARIMA$(p, d, q)$ process. We can add an arbitrary polynomial trend of degree $d - 1$, i.e., consider $W_t = X_t + A_0 + A_1 t + \cdots + A_{d-1} t^{d-1}$ with $A_0, \ldots, A_{d-1}$ being any random variables, and $\{W_t\}$ still satisfies the same ARIMA difference equation
In other words, the ARIMA DE determines the second-order properties of $\{(1 - B)^d X_t\}$ but not those of $\{X_t\}$
- For parameter estimation: $\phi$, $\theta$, and $\sigma^2$ are estimated based on $\{(1 - B)^d X_t\}$ rather than $\{X_t\}$ (see the R sketch below)
- For forecasting, we need additional assumptions
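A small R sketch of this point, using the dowj series loaded above (the order (1,1,0) is chosen only for illustration): the ARMA parameters of an ARIMA fit are estimated from the differenced series.
fit_arima = arima(dowj, order = c(1, 1, 0));                             ## ARIMA(1,1,0) on X_t
fit_diff  = arima(diff(dowj), order = c(1, 0, 0), include.mean = FALSE); ## AR(1) on (1 - B)X_t
coef(fit_arima); coef(fit_diff);                                         ## essentially the same AR coefficient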
Fit data using ARIMA processes
- Whether to fit a finite time series using
- non-stationary models (such as ARIMA), or
- directly using stationary models (such as ARMA)?
- If the fitted stationary ARMA model's AR polynomial $\phi(z)$ has zeros very close to the unit circle, then fitting an ARIMA model is better (see the R sketch after this list)
- Parameter estimation is stable
- The differenced series may only need a low-order ARMA
- Limitation of ARIMA: only permits data to be nonstationary in a
very special way
- A general non-stationary process: $\phi^*(z)$ can have zeros anywhere on the unit circle
- ARIMA model: $\phi^*(z)$ only has a zero of multiplicity $d$ at the point $z = 1$
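As a rough diagnostic (a sketch, not part of the original derivation), one can fit a stationary AR model to the raw series and check how close the root of its AR polynomial is to the unit circle; the AR(1) order below is only illustrative.
fit_ar1 = arima(dowj, order = c(1, 0, 0));       ## stationary AR(1) fit to the raw series
abs(polyroot(c(1, -coef(fit_ar1)["ar1"])));      ## a modulus very close to 1 suggests differencing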
Transformation and Identification Techniques
Natural log transformation
When the variance of the data increases with the mean, it is common to apply a log transformation before fitting the data with an ARMA or ARIMA model.
When does the log transformation work well? Suppose that $\mathrm{E}(X_t) = \mu_t$ and $\sqrt{\mathrm{Var}(X_t)} = \sigma \mu_t$, i.e., the standard deviation is proportional to the mean. Then by a first-order Taylor expansion of $\log X_t$ at $\mu_t$: $\log X_t \approx \log \mu_t + (X_t - \mu_t)/\mu_t$, so $\mathrm{Var}(\log X_t) \approx \sigma^2$. The data after log transformation have an (approximately) constant variance (see the simulation sketch below).
- Note: log transformation can only be applied to positive data
Note: If $Y_t = \log X_t$, then $\mathrm{E}(X_t) \neq \exp\{\mathrm{E}(Y_t)\}$, because expectation and logarithm are not interchangeable.
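A quick simulation sketch of the variance-stabilization argument (hypothetical normal data with standard deviation proportional to the mean):
set.seed(1);
mu = seq(10, 100, length.out = 5);                                  ## increasing means
sims = lapply(mu, function(m) rnorm(1e4, mean = m, sd = 0.1 * m));  ## sd proportional to mean
sapply(sims, var);                                                  ## variance grows with the mean
sapply(sims, function(v) var(log(v)));                              ## roughly constant, close to 0.1^2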
Generalize the log transformation: Box-Cox transformation
Box-Cox transformation: $f_\lambda(x) = \begin{cases} (x^\lambda - 1)/\lambda, & x \ge 0,\ \lambda > 0, \\ \log x, & x > 0,\ \lambda = 0. \end{cases}$
- Usual range: $0 \le \lambda \le 1.5$
- Common values: $\lambda = 0$ (log) and $\lambda = 0.5$ (see the sketch below)
Note: the Box-Cox transformation can only be applied to non-negative data
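A minimal sketch of the transformation in R (boxcox_transform is a hypothetical helper written here only for illustration):
boxcox_transform = function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda;
}
par(mfrow = c(1, 2));
plot.ts(boxcox_transform(dowj, 0),   main = 'lambda = 0 (log)');
plot.ts(boxcox_transform(dowj, 0.5), main = 'lambda = 0.5');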
Unit Root Test
Unit root test for an AR(1) process
$\{X_t\}$ is an AR(1) process: $X_t - \mu = \phi_1 (X_{t-1} - \mu) + Z_t$, $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$
- Equivalent DE: $\nabla X_t = X_t - X_{t-1} = \phi_0^* + \phi_1^* X_{t-1} + Z_t$
- where $\phi_0^* = \mu(1 - \phi_1)$ and $\phi_1^* = \phi_1 - 1$
- Regressing $\nabla X_t$ onto $1$ and $X_{t-1}$, we get the OLS estimator $\hat{\phi}_1^*$ and its standard error $\widehat{\mathrm{SE}}(\hat{\phi}_1^*)$
- Augmented Dickey-Fuller test for AR(1)
- Hypotheses: $H_0: \phi_1 = 1$ vs. $H_1: \phi_1 < 1$
- Equivalent hypotheses: $H_0: \phi_1^* = 0$ vs. $H_1: \phi_1^* < 0$
- Test statistic: $\hat{\tau}_\mu := \hat{\phi}_1^* / \widehat{\mathrm{SE}}(\hat{\phi}_1^*)$; its limit distribution under $H_0$ is neither normal nor Student's $t$
- Rejection region: reject $H_0$ if $\hat{\tau}_\mu$ is below the level-$\alpha$ quantile of the Dickey-Fuller distribution
Unit root test for an AR(p) process
AR(p) process: $X_t - \mu = \phi_1 (X_{t-1} - \mu) + \cdots + \phi_p (X_{t-p} - \mu) + Z_t$, $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$
Equivalent DE: $\nabla X_t = \phi_0^* + \phi_1^* X_{t-1} + \phi_2^* \nabla X_{t-1} + \cdots + \phi_p^* \nabla X_{t-p+1} + Z_t$
- where $\phi_0^* = \mu(1 - \phi_1 - \cdots - \phi_p)$, $\phi_1^* = \sum_{i=1}^{p} \phi_i - 1$, and $\phi_j^* = -\sum_{i=j}^{p} \phi_i$ for $j = 2, \ldots, p$
- Regressing $\nabla X_t$ onto $1, X_{t-1}, \nabla X_{t-1}, \ldots, \nabla X_{t-p+1}$, we get the OLS estimator $\hat{\phi}_1^*$ and its standard error $\widehat{\mathrm{SE}}(\hat{\phi}_1^*)$
- Augmented Dickey-Fuller test for AR(p)
- Hypotheses: $H_0: \phi_1^* = 0$ (the AR polynomial has a unit root) vs. $H_1: \phi_1^* < 0$
- Test statistic: $\hat{\tau}_\mu := \hat{\phi}_1^* / \widehat{\mathrm{SE}}(\hat{\phi}_1^*)$
- Rejection region: same as the augmented Dickey-Fuller test for AR(1) (see the regression sketch below)
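A minimal sketch of the Dickey-Fuller regression for the AR(1) case with an intercept only (note that tseries::adf.test, used below, also includes lagged differences and a linear trend term):
x  = dowj;
dx = diff(x);                                  ## nabla X_t
xl = x[-length(x)];                            ## X_{t-1}
fit = lm(dx ~ xl);                             ## regress nabla X_t on 1 and X_{t-1}
coef(summary(fit))["xl", "t value"];           ## tau = phi1-star-hat / SE(phi1-star-hat)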
Implement augmented Dickey-Fuller test in R
library(tseries);
## Note: the lag k here is the AR order p
adf.test(dowj, k = 2);
##
## Augmented Dickey-Fuller Test
##
## data: dowj
## Dickey-Fuller = -1.3788, Lag order = 2, p-value = 0.8295
## alternative hypothesis: stationary
Forecast ARIMA models
Forecast an ARIMA process
$\{X_t\}$ is ARIMA$(p, d, q)$, and $Y_t = (1 - B)^d X_t$ is a causal ARMA$(p, q)$ process
- Best linear predictor: $P_n X_{n+h}$
- $P_n$ means prediction based on the observed data $\{X_{1-d}, \ldots, X_0, X_1, \ldots, X_n\}$, or equivalently, $\{X_{1-d}, \ldots, X_0, Y_1, \ldots, Y_n\}$
- To find $P_n X_{n+h}$, we need to know $P_n X_{n+j}$ and $P_n Y_{n+j}$, for $j = 1, \ldots, h$.
A sufficient assumption for $P_n X_{n+h}$ to be the best linear predictor in terms of the observed data: the initial vector $(X_{1-d}, \ldots, X_0)$ is uncorrelated with $Y_t$ for all $t > 0$
Forecast an ARIMA process
The observed ARIMA process $\{X_t\}$ satisfies $\phi(B)(1 - B)^d X_t = \theta(B) Z_t$, i.e., $(1 - B)^d X_t = Y_t$ with $\{Y_t\}$ a causal ARMA$(p, q)$ process
Assumption: the random vector $(X_{1-d}, \ldots, X_0)$ is uncorrelated with $Y_t$ for all $t > 0$
One-step predictors: $\hat{Y}_{n+1} := P_n Y_{n+1}$ and $\hat{X}_{n+1} := P_n X_{n+1}$
Recall: the $h$-step predictor of the ARMA process $\{Y_t\}$ for $h \ge 1$ (and $n > \max(p, q)$): $P_n Y_{n+h} = \sum_{i=1}^{p} \phi_i P_n Y_{n+h-i} + \sum_{j=h}^{q} \theta_{n+h-1, j} (Y_{n+h-j} - \hat{Y}_{n+h-j})$
$h$-step predictor of the ARIMA process $\{X_t\}$ for $h \ge 1$: $P_n X_{n+h} = \sum_{j=1}^{p+d} \phi_j^* P_n X_{n+h-j} + \sum_{j=h}^{q} \theta_{n+h-1, j} (X_{n+h-j} - \hat{X}_{n+h-j})$, where $\phi^*(z) = \phi(z)(1 - z)^d = 1 - \phi_1^* z - \cdots - \phi_{p+d}^* z^{p+d}$
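A minimal forecasting sketch in R (the ARIMA(1,1,0) order for dowj is only an illustrative choice):
fit = arima(dowj, order = c(1, 1, 0));
predict(fit, n.ahead = 5);        ## $pred gives the h-step forecasts, $se their standard errors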
Seasonal ARIMA Models
Seasonal ARIMA (SARIMA) Model: definition
Suppose $d, D$ are non-negative integers. $\{X_t\}$ is a seasonal ARIMA$(p, d, q) \times (P, D, Q)_s$ process with period $s$ if the differenced series $Y_t = (1 - B)^d (1 - B^s)^D X_t$ is a causal ARMA process defined by $\phi(B) \Phi(B^s) Y_t = \theta(B) \Theta(B^s) Z_t$, $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$, where $\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$, $\Phi(z) = 1 - \Phi_1 z - \cdots - \Phi_P z^P$, $\theta(z) = 1 + \theta_1 z + \cdots + \theta_q z^q$, and $\Theta(z) = 1 + \Theta_1 z + \cdots + \Theta_Q z^Q$
$\{Y_t\}$ is causal if and only if neither $\phi(z)$ nor $\Phi(z)$ has zeros inside the unit circle
Usually, $s = 12$ for monthly data
Special case: seasonal ARMA (SARMA)
Between-year model: for monthly data, each one of the 12 time series (one per calendar month) is generated by the same ARMA model
- SARMA$(P, Q)$ with period $s = 12$: $X_t = \Phi_1 X_{t-12} + \cdots + \Phi_P X_{t-12P} + U_t + \Theta_1 U_{t-12} + \cdots + \Theta_Q U_{t-12Q}$, i.e., $\Phi(B^{12}) X_t = \Theta(B^{12}) U_t$; in the above between-year model, the period 12 can be changed to any positive integer $s$
- If $\{U_t\} \sim \mathrm{WN}(0, \sigma^2)$, then the ACVF $\gamma(h) = 0$ unless 12 divides $h$ evenly. But this may not be ideal for real-life applications! E.g., this February is correlated with last February, but not with this January.
- General SARMA with period $s = 12$: incorporate dependence between the 12 series by letting $\{U_t\}$ be an ARMA process: $\phi(B) U_t = \theta(B) Z_t$, $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$
- Equivalent DE for the general SARMA: $\phi(B) \Phi(B^{12}) X_t = \theta(B) \Theta(B^{12}) Z_t$ (see the simulation sketch below)
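A simulation sketch of the equivalent DE for a pure-AR SARMA, i.e., $(1 - \phi_1 B)(1 - \Phi_1 B^{12}) X_t = Z_t$, with arbitrary illustrative coefficient values:
phi1 = 0.5; Phi1 = 0.5;                                ## illustrative values only
ar_coefs = c(phi1, rep(0, 10), Phi1, -phi1 * Phi1);    ## expand (1 - phi1*B)(1 - Phi1*B^12)
x_sim = arima.sim(model = list(ar = ar_coefs), n = 240);
acf(x_sim, lag.max = 48);                              ## note the dependence at multiples of lag 12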
Fit a SARIMA Model
- Period $s$ is known
- Find $d$ and $D$ to make the differenced series $Y_t = (1 - B)^d (1 - B^s)^D X_t$ look stationary
- Examine the sample ACF and sample PACF of $\{Y_t\}$ at lags that are multiples of $s$, to find the orders $P, Q$
- Examine the sample ACF and sample PACF of $\{Y_t\}$ at lags $1, \ldots, s - 1$ to find the orders $p, q$
- Use AICC to decide among competing order choices
- Given the orders $(p, d, q) \times (P, D, Q)$, compute the MLE of the parameters (see the fitting sketch below)
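A fitting sketch on the built-in AirPassengers series, using the classic "airline" orders $(0,1,1) \times (0,1,1)_{12}$ purely as an illustration:
y_air = log(AirPassengers);                      ## log transform: variance grows with the mean
fit_sarima = arima(y_air, order = c(0, 1, 1),
                   seasonal = list(order = c(0, 1, 1), period = 12));
fit_sarima;                                      ## inspect coefficients and AIC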
Regression with ARMA Errors
Regression with ARMA errors: OLS estimation
- Linear model with ARMA errors $\{W_t\}$: $Y_t = x_t^\top \beta + W_t$, $t = 1, \ldots, n$; in matrix form, $Y = X\beta + W$
- Note: each row is indexed by a different time $t$!
- Error covariance: $\Gamma_n = \mathrm{E}(W W^\top)$
- Ordinary least squares (OLS) estimator: $\hat{\beta}_{\mathrm{OLS}} = (X^\top X)^{-1} X^\top Y$
- Estimated by minimizing the sum of squares $(Y - X\beta)^\top (Y - X\beta)$
- OLS is unbiased, even when errors are dependent!
Regression with ARMA errors: GLS estimation
- Generalized least squares (GLS) estimator: $\hat{\beta}_{\mathrm{GLS}} = (X^\top \Gamma_n^{-1} X)^{-1} X^\top \Gamma_n^{-1} Y$
- Estimated by minimizing the weighted sum of squares $(Y - X\beta)^\top \Gamma_n^{-1} (Y - X\beta)$
- Covariance: $\mathrm{Cov}(\hat{\beta}_{\mathrm{GLS}}) = (X^\top \Gamma_n^{-1} X)^{-1}$
- GLS is the best linear unbiased estimator, i.e., for any vector $c$ and any linear unbiased estimator $\tilde{\beta}$, we have $\mathrm{Var}(c^\top \hat{\beta}_{\mathrm{GLS}}) \le \mathrm{Var}(c^\top \tilde{\beta})$
When $\{W_t\}$ is an AR(p) process
We can apply $\phi(B)$ to each side of the regression equation and get uncorrelated, zero-mean, constant-variance errors $\phi(B) W_t = Z_t$
With the transformed target variable $\phi(B) Y_t$ and the transformed design matrix with rows $\phi(B) x_t^\top$, $t = p + 1, \ldots, n$, the OLS estimator is the best linear unbiased estimator (see the sketch below)
Note: after the transformation, the regression sample size reduces to $n - p$
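A minimal sketch of this transformation for AR(1) errors with known $\phi_1$ (all data below are hypothetical and simulated):
set.seed(1);
n = 200;
x_reg = rnorm(n);                                ## a single regressor
w = arima.sim(list(ar = 0.7), n = n);            ## AR(1) errors
y = 1 + 2 * x_reg + w;                           ## regression with AR(1) errors
phi1 = 0.7;                                      ## assume phi1 is known (or pre-estimated)
y_star = y[-1] - phi1 * y[-n];                   ## phi(B) applied to the response
x_star = x_reg[-1] - phi1 * x_reg[-n];           ## phi(B) applied to the regressor
lm(y_star ~ x_star);                             ## OLS on the n - 1 transformed observations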
Regression with ARMA errors: MLE
The MLE of $(\beta, \phi, \theta, \sigma^2)$ can be computed by maximizing the Gaussian likelihood with error covariance $\Gamma_n$ (a code sketch follows at the end of this subsection)
An iterative scheme:
1. Compute $\hat{\beta}_{\mathrm{OLS}}$ and the regression residuals $Y_t - x_t^\top \hat{\beta}_{\mathrm{OLS}}$
2. Based on the estimated residuals, compute the MLE of the ARMA$(p, q)$ parameters
3. Based on the fitted ARMA model, compute $\hat{\beta}_{\mathrm{GLS}}$
4. Compute the regression residuals $Y_t - x_t^\top \hat{\beta}_{\mathrm{GLS}}$, and return to Step 2 until the estimators stabilize
- Asymptotic properties of MLE:
If $\{W_t\}$ is a causal and invertible ARMA process, then
- MLEs are asymptotically normal
- Estimated regression coefficients are asymptotically independent of estimated ARMA parameters
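In practice, the joint Gaussian MLE of the regression coefficients and the ARMA error parameters can be computed with stats::arima via its xreg argument; a sketch reusing the hypothetical y and x_reg simulated in the previous sketch:
fit_mle = arima(y, order = c(1, 0, 0), xreg = x_reg);   ## jointly estimates beta and the AR(1) error parameters
fit_mle;                                                ## intercept, xreg coefficient, ar1, and sigma^2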
References
Brockwell, Peter J. and Davis, Richard A. (2016), Introduction to Time Series and Forecasting, Third Edition. New York: Springer
Weigt, George (2018), ITSM-R Reference Manual. http://www.eigenmath.org/itsmr-refman.pdf
R package: tseries, https://cran.r-project.org/web/packages/tseries/index.html