Best linear predictor
Goal: find a function of \(X_n\) that gives the “best” predictor of \(X_{n+h}\).
- By "best" we mean the predictor that achieves the minimum mean squared error
- Under the assumption that \(X_n\) and \(X_{n+h}\) are jointly normal, the best predictor is \[ m(X_n) = E(X_{n+h} \mid X_n) = \mu + \rho(h)(X_n - \mu) \]
Best linear predictor \[ \ell(X_n) = a X_n + b \]
- For Gaussian processes, \(\ell(X_n)\) and \(m(X_n)\) are the same.
- The best linear predictor only depends on the mean and ACF of the series \(\{X_n\}\)
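As a quick numerical check of this formula (not from the slides), the sketch below simulates a jointly normal pair \((X_n, X_{n+h})\) and verifies that the least-squares line has slope \(\rho(h)\) and intercept \(\mu(1 - \rho(h))\); the values of \(\mu\), \(\rho\), the seed, and the sample size are arbitrary.

```python
import numpy as np

# Simulate (X_n, X_{n+h}) as a bivariate normal pair with common mean mu and correlation rho,
# then fit the least-squares line: its slope/intercept should match rho and mu*(1 - rho).
rng = np.random.default_rng(0)
mu, rho, N = 2.0, 0.6, 200_000
cov = [[1.0, rho], [rho, 1.0]]
xn, xnh = rng.multivariate_normal([mu, mu], cov, size=N).T

slope, intercept = np.polyfit(xn, xnh, 1)
print(slope, intercept)        # approximately 0.6 and 2.0 * (1 - 0.6) = 0.8
```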
Properties of ACVF \(\gamma(\cdot)\) and ACF \(\rho(\cdot)\)
\(\gamma(0) \geq 0\)
\(|\gamma(h)| \leq \gamma(0)\) for all \(h\)
\(\gamma(h)\) is an even function, i.e., \(\gamma(h) = \gamma(-h)\) for all \(h\)
A function \(\kappa: \mathbb{Z} \rightarrow \mathbb{R}\) is nonnegative definite if \[ \sum_{i, j= 1}^n a_i \kappa(i - j) a_j \geq 0 \] for all positive integers \(n\) and vectors \(\mathbf{a} = (a_1, \ldots, a_n)' \in \mathbb{R}^n\)
Theorem: a real-valued function defined on the integers is the autocovariance function of a stationary time series if and only if it is even and nonnegative definite
ACF \(\rho(\cdot)\) has all above properties of ACVF \(\gamma(\cdot)\)
- Plus one more: \(\rho(0) = 1\)
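The theorem suggests a simple numerical check (illustrative, not from the slides): form the matrix \(\left[\kappa(i-j)\right]_{i,j=1}^n\) from candidate values \(\kappa(0), \ldots, \kappa(n-1)\) and test whether it is positive semidefinite. The helper name and the two examples below are my own choices.

```python
import numpy as np

def is_valid_acvf(kappa_vals, tol=1e-10):
    """Test nonnegative definiteness of kappa(0), ..., kappa(n-1) by checking that
    the symmetric Toeplitz matrix [kappa(|i-j|)] is positive semidefinite."""
    n = len(kappa_vals)
    K = np.array([[kappa_vals[abs(i - j)] for j in range(n)] for i in range(n)])
    return np.linalg.eigvalsh(K).min() >= -tol

# AR(1) ACVF gamma(h) = sigma^2 * phi^|h| / (1 - phi^2): a valid ACVF
phi, sigma2 = 0.7, 1.0
gamma = sigma2 * phi ** np.arange(20) / (1 - phi ** 2)
print(is_valid_acvf(gamma))                  # True

# kappa(0) = 1, kappa(1) = 0.9, kappa(h) = 0 otherwise: even and |kappa(1)| <= kappa(0),
# but NOT nonnegative definite, hence not the ACVF of any stationary series
print(is_valid_acvf([1.0, 0.9, 0.0, 0.0]))   # False
```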
Linear Processes
Linear processes: definitions
A time series \(\{X_t\}\) is a linear process if \[ X_t = \sum_{j = -\infty}^{\infty} \psi_j Z_{t-j} \] where \(\{Z_t\} \sim \textrm{WN}(0, \sigma^2)\), and the constants \(\{\psi_j\}\) satisfy \[ \sum_{j = -\infty}^{\infty} |\psi_j| < \infty \]
Equivalent representation using backward shift operator \(B\) \[ X_t = \psi(B) Z_t, \quad \psi(B) = \sum_{j = -\infty}^{\infty} \psi_j B^j \]
Special case: moving average MA\((\infty)\) \[ X_t = \sum_{j = 0}^{\infty} \psi_j Z_{t-j} \]
Linear processes: properties
In the definition of the linear process \(X_t = \sum_{j = -\infty}^{\infty} \psi_j Z_{t-j}\), the condition \(\sum_{j = -\infty}^{\infty} |\psi_j| < \infty\) ensures
- The infinite sum \(X_t\) converges with probability 1
- \(\sum_{j = -\infty}^{\infty} \psi_j^2 < \infty\), and hence \(X_t\) converges in mean square, i.e., \(X_t\) is the mean square limit of the partial sum \(\sum_{j = -n}^{n} \psi_j Z_{t-j}\)
Applying a linear filter with absolutely summable coefficients to a stationary time series produces another stationary series
Theorem: let \(\{Y_t\}\) be a stationary time series with mean 0 and ACVF \(\gamma_Y\). If \(\sum_{j = -\infty}^{\infty} |\psi_j| < \infty\), then the time series \[ X_t = \sum_{j = -\infty}^{\infty} \psi_j Y_{t-j} = \psi(B) Y_t \] is stationary with mean 0 and ACVF \[ \gamma_X(h) = \sum_{j = -\infty}^{\infty}\sum_{k = -\infty}^{\infty} \psi_j \psi_k \gamma_Y(h + k - j) \]
Special case of the above result: If \(\{X_t\}\) is a linear process, then its ACVF is \[ \gamma_X(h) = \sum_{j = -\infty}^{\infty} \psi_j \psi_{j + h} \sigma^2 \]
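A small numpy sketch (illustrative, not from the slides) that evaluates \(\gamma_X(h) = \sigma^2\sum_j \psi_j\psi_{j+h}\) for a truncated MA\((\infty)\) and checks it against the closed-form AR\((1)\) ACVF \(\sigma^2\phi^{|h|}/(1-\phi^2)\); the helper name and truncation order are arbitrary choices:

```python
import numpy as np

def linear_process_acvf(psi, h, sigma2=1.0):
    """ACVF of X_t = sum_j psi_j Z_{t-j}: gamma_X(h) = sigma^2 * sum_j psi_j * psi_{j+h}.
    `psi` holds psi_0, psi_1, ... (an MA(infinity) truncated at a finite order)."""
    h = abs(h)
    return sigma2 * np.sum(psi[:len(psi) - h] * psi[h:])

# AR(1) with |phi| < 1 has psi_j = phi^j, so gamma_X(h) should equal sigma^2 phi^h / (1 - phi^2)
phi, sigma2 = 0.6, 2.0
psi = phi ** np.arange(200)          # truncated psi-weights
for h in range(4):
    print(linear_process_acvf(psi, h, sigma2),
          sigma2 * phi ** h / (1 - phi ** 2))
```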
Combine multiple linear filters
- Linear filters with absolutely summable coefficients \[ \alpha(B) = \sum_{j = -\infty}^{\infty} \alpha_j B^j, \quad \beta(B) = \sum_{j = -\infty}^{\infty} \beta_j B^j \] can be applied successively to a stationary series \(\{Y_t\}\) to generate a new stationary series \[ W_t = \sum_{j = -\infty}^{\infty} \psi_j Y_{t-j}, \quad \psi_j = \sum_{k=-\infty}^{\infty} \alpha_k \beta_{j-k} = \sum_{k=-\infty}^{\infty} \beta_k \alpha_{j-k} \] or equivalently, \[ W_t = \psi(B) Y_t, \quad \psi(B) = \alpha(B) \beta(B) = \beta(B)\alpha(B) \]
AR\((1)\) process \(X_t - \phi X_{t-1} = Z_t\), in linear process form
If \(|\phi| < 1\), then \[ X_t = \sum_{j=0}^{\infty} \phi^j Z_{t-j} \]
- Since \(X_t\) only depends on \(\{Z_s, s \leq t\}\), we say \(\{X_t\}\) is causal or future-independent
If \(|\phi| > 1\), then \[ X_t = -\sum_{j=1}^{\infty} \phi^{-j} Z_{t+j} \]
- This is because \(X_t = -\phi^{-1} Z_{t+1} + \phi^{-1} X_{t+1}\)
- Since \(X_t\) depends on \(\{Z_s, s \geq t\}\), we say \(\{X_t\}\) is noncausal
If \(\phi = \pm 1\), then there is no stationary linear process solution
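The causal representation can be verified by simulation: for \(|\phi| < 1\), building \(X_t\) from the truncated sum \(\sum_{j=0}^{J} \phi^j Z_{t-j}\) reproduces the recursion \(X_t = \phi X_{t-1} + Z_t\) up to a negligible truncation error. A minimal sketch, with arbitrary seed, truncation order, and burn-in:

```python
import numpy as np

rng = np.random.default_rng(42)
phi, n, burn, J = 0.6, 500, 200, 100

# (i) simulate the recursion X_t = phi * X_{t-1} + Z_t after a burn-in period
Z = rng.normal(size=n + burn)
X_rec = np.zeros(n + burn)
for t in range(1, n + burn):
    X_rec[t] = phi * X_rec[t - 1] + Z[t]

# (ii) rebuild the same path from the causal form X_t = sum_{j >= 0} phi^j Z_{t-j}, truncated at J
psi = phi ** np.arange(J)
X_ma = np.array([np.dot(psi, Z[t - np.arange(J)]) for t in range(burn, n + burn)])

print(np.max(np.abs(X_rec[burn:] - X_ma)))   # negligible: the two representations agree
```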
Introduction to ARMA Processes
ARMA\((1,1)\) process
ARMA\((1,1)\) process: definitions
The time series \(\{X_t\}\) is an ARMA\((1, 1)\) process if it is stationary and satisfies \[ X_t - \phi X_{t-1} = Z_t + \theta Z_{t-1} \] where \(\{Z_t\} \sim \textrm{WN}(0, \sigma^2)\) and \(\phi + \theta \neq 0\)
Equivalent representation using the backward shift operator \[ \phi(B) X_t = \theta(B) Z_t, \quad\text{where } \phi(B) = 1 - \phi B, \ \theta(B) = 1 + \theta B \]
ARMA\((1, 1)\) process in linear process format
If \(\phi \neq \pm 1\), by letting \(\chi(z) = 1/\phi(z)\), we can write an ARMA\((1, 1)\) as \[ X_t = \chi(B) \theta(B) Z_t = \psi(B) Z_t, \quad \text{where } \psi(B) = \sum_{j=-\infty}^{\infty} \psi_j B^j \]
If \(|\phi| < 1\), then \(\chi(z) = \sum_{j=0}^{\infty} \phi^j z^j\), and \[ \psi_j = \begin{cases} 0, & \text{if } j \leq -1,\\ 1, & \text{if } j = 0, \\ (\phi + \theta) \phi^{j-1}, & \text{if } j \geq 1 \end{cases} \quad \text{Causal} \]
If \(|\phi| > 1\), then \(\chi(z) = -\sum_{j=-\infty}^{-1} \phi^{j} z^{j}\), and \[ \psi_j = \begin{cases} -(\theta + \phi) \phi^{j-1}, & \text{if } j \leq -1,\\ -\theta\phi^{-1}, & \text{if } j = 0, \\ 0, & \text{if } j \geq 1 \end{cases} \quad \text{Noncausal} \]
If \(\phi = \pm 1\), then there is no such stationary ARMA\((1, 1)\) process
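For the causal case, the \(\psi\)-weights can also be obtained by matching coefficients in \(\psi(z)(1 - \phi z) = 1 + \theta z\), giving the recursion \(\psi_0 = 1\), \(\psi_1 = \phi + \theta\), \(\psi_j = \phi\,\psi_{j-1}\) for \(j \geq 2\). A short sketch checking this against the closed form above; the function name and parameter values are illustrative:

```python
import numpy as np

def arma11_psi_weights(phi, theta, J=10):
    """Causal psi-weights of an ARMA(1,1) with |phi| < 1, from the recursion
    psi_0 = 1, psi_1 = phi + theta, psi_j = phi * psi_{j-1} for j >= 2."""
    psi = np.zeros(J)
    psi[0] = 1.0
    for j in range(1, J):
        psi[j] = phi * psi[j - 1] + (theta if j == 1 else 0.0)
    return psi

phi, theta = 0.5, 0.4
psi = arma11_psi_weights(phi, theta)
closed_form = np.r_[1.0, (phi + theta) * phi ** np.arange(9)]   # psi_j = (phi + theta) phi^(j-1), j >= 1
print(np.allclose(psi, closed_form))                            # True
```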
Invertibility
- Invertibility is the dual concept of causality
- Causal: \(X_t\) can be expressed by \(\{Z_s, s \leq t\}\)
- Invertible: \(Z_t\) can be expressed by \(\{X_s, s \leq t\}\)
- For an ARMA\((1, 1)\) process,
- If \(|\theta|< 1\), then it is invertible
- If \(|\theta|> 1\), then it is noninvertible
Properties of the Sample ACVF and Sample ACF
Estimation of the series mean \(\mu = E(X_t)\)
The sample mean \(\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i\) is an unbiased estimator of \(\mu\)
- Mean squared error \[ E(\bar{X}_n - \mu)^2 = \frac{1}{n} \sum_{h = -n}^n \left( 1 - \frac{|h|}{n} \right) \gamma(h) \]
Theorem: If \(\{X_t\}\) is a stationary time series with mean \(\mu\) and ACVF \(\gamma(\cdot)\), then as \(n \rightarrow \infty\), \[ \mathrm{Var}(\bar{X}_n) = E(\bar{X}_n - \mu)^2 \longrightarrow 0, \quad \text{if } \gamma(n) \rightarrow 0, \] \[ n E(\bar{X}_n - \mu)^2 \longrightarrow \sum_{|h| <\infty} \gamma(h), \quad \text{if } \sum_{h = -\infty}^{\infty} |\gamma(h)| < \infty \]
Confidence bounds of \(\mu\)
If \(\{X_t\}\) is Gaussian, then \[ \sqrt{n} (\bar{X}_n - \mu) \sim \textrm{N} \left( 0, \sum_{|h| < n} \left( 1 - \frac{|h|}{n} \right) \gamma(h) \right) \]
- For many common time series, such as linear and ARMA models, when \(n\) is large,
\(\bar{X}_n\) is approximately normal:
\[
\bar{X}_n \sim \textrm{N}\left(\mu, \frac{v}{n} \right), \quad
v = \sum_{|h|<\infty} \gamma(h)
\]
- An approximate 95% confidence interval for \(\mu\) is \[ \left(\bar{X}_n - 1.96 v^{1/2}/\sqrt{n}, \ \bar{X}_n + 1.96 v^{1/2}/\sqrt{n}\right) \]
- To estimate \(v\), we can use \[ \hat{v} = \sum_{|h|< \sqrt{n}} \left( 1 - \frac{|h|}{\sqrt{n}} \right) \hat{\gamma}(h) \]
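A minimal numpy sketch of this interval (the sketch also implements the sample ACVF \(\hat\gamma(h)\) introduced in the next subsection); the AR(1) data-generating step, seed, and sample size are purely illustrative:

```python
import numpy as np

def sample_acvf(x, h):
    """Sample ACVF gamma_hat(h) with the 1/n convention."""
    n, xbar = len(x), np.mean(x)
    h = abs(h)
    return np.sum((x[h:] - xbar) * (x[:n - h] - xbar)) / n

def mean_confidence_interval(x, z=1.96):
    """Approximate 95% CI for mu with v_hat = sum_{|h| < sqrt(n)} (1 - |h|/sqrt(n)) gamma_hat(h)."""
    n, xbar = len(x), np.mean(x)
    root_n = np.sqrt(n)
    lags = np.arange(1, int(np.ceil(root_n)))      # h = 1, ..., largest lag below sqrt(n)
    v_hat = sample_acvf(x, 0) + 2 * sum((1 - h / root_n) * sample_acvf(x, h) for h in lags)
    half = z * np.sqrt(v_hat) / np.sqrt(n)
    return xbar - half, xbar + half

# illustrative AR(1) data with true mean 5.0
rng = np.random.default_rng(0)
phi, n = 0.6, 400
Z = rng.normal(size=n + 100)
X = np.zeros(n + 100)
for t in range(1, n + 100):
    X[t] = phi * X[t - 1] + Z[t]
print(mean_confidence_interval(X[100:] + 5.0))
```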
Estimation of ACVF \(\gamma(\cdot)\) and ACF \(\rho(\cdot)\)
- Use sample ACVF \(\hat{\gamma}(\cdot)\) and sample ACF \(\hat{\rho}(\cdot)\)
\[
\hat{\gamma}(h) = \frac{1}{n} \sum_{t=1}^{n-|h|}
(X_{t + |h|} - \bar{X}_n)(X_t - \bar{X}_n), \quad
\hat{\rho}(h) = \hat{\gamma}(h) / \hat{\gamma}(0)
\]
- Even if the factor \(1/n\) is replaced by \(1/(n-h)\), they are still biased
- They are nearly unbiased for large \(n\)
When \(h\) is close to \(n\), the estimators \(\hat{\gamma}(h)\) and \(\hat{\rho}(h)\) are unreliable, since only a few pairs \((X_{t + h}, X_t)\) are available.
- A useful guide for them to be reliable (due to Box and Jenkins): \[ n \geq 50, \quad h \leq n/4\]
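The bias remark can be illustrated by a small Monte Carlo (not from the slides): for IID noise \(\rho(1) = 0\), yet the average of \(\hat{\rho}(1)\) over many replications is slightly negative (roughly \(-1/n\)) and shrinks as \(n\) grows. A sketch with arbitrary replication counts:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF rho_hat(h) for h = 0, ..., max_lag, using the 1/n convention."""
    n, d = len(x), x - np.mean(x)
    gamma = np.array([np.sum(d[h:] * d[:n - h]) / n for h in range(max_lag + 1)])
    return gamma / gamma[0]

# Monte Carlo: for IID noise rho(1) = 0, yet E[rho_hat(1)] is slightly negative
# (roughly -1/n) and the bias shrinks as n grows.
rng = np.random.default_rng(1)
for n in (20, 50, 200):
    rho1 = [sample_acf(rng.normal(size=n), 1)[1] for _ in range(5000)]
    print(n, np.mean(rho1))
```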
Bartlett’s Formula
Asymptotic distribution of \(\hat{\rho}(\cdot)\)
For linear models, especially ARMA models, when \(n\) is large, \(\hat{\boldsymbol\rho}_k = (\hat{\rho}(1), \ldots, \hat{\rho}(k))'\) is approximately normal \[ \hat{\boldsymbol\rho}_k \sim \textrm{N}(\boldsymbol\rho, n^{-1}W) \]
By Bartlett’s formula, \(W\) is the covariance matrix with entries \[\begin{align*} w_{ij} = \sum_{k=1}^{\infty} & \left[ \rho(k + i) + \rho(k - i) - 2 \rho(i)\rho(k) \right] \\ & \times \left[ \rho(k + j) + \rho(k - j) - 2 \rho(j)\rho(k) \right] \end{align*}\]
- Special cases
Marginally, for any \(j \geq 1\), \[ \hat{\rho}(j) \sim \textrm{N}(\rho(j), n^{-1} w_{jj}) \]
iid noise \[ w_{ij} = \begin{cases} 1, & \text{if } i = j,\\ 0, & \text{otherwise} \end{cases} \Longleftrightarrow \hat{\rho}(k) \sim \textrm{N}(0, 1/n), \ k = 1, \ldots, n \]
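A quick numerical check of the IID special case (settings are illustrative): for a white-noise sample, roughly 95% of the sample autocorrelations should fall inside \(\pm 1.96/\sqrt{n}\):

```python
import numpy as np

# For IID noise, rho_hat(k) is approximately N(0, 1/n), so about 95% of the sample
# autocorrelations should fall inside +/- 1.96/sqrt(n).
rng = np.random.default_rng(7)
n, max_lag = 500, 40
x = rng.normal(size=n)
d = x - np.mean(x)
gamma = np.array([np.sum(d[h:] * d[:n - h]) / n for h in range(max_lag + 1)])
rho = gamma / gamma[0]
bound = 1.96 / np.sqrt(n)
print(np.mean(np.abs(rho[1:]) <= bound))   # close to 0.95
```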
Forecast Stationary Time Series
Best linear predictor: minimizes MSE
Best linear predictor: definition
For a stationary time series \(\{X_t\}\) with known mean \(\mu\) and ACVF \(\gamma\), our goal is to find the linear combination of \(1, X_n, X_{n-1}, \ldots, X_1\) that forecasts \(X_{n+h}\) with minimum mean squared error
Best linear predictor: \[ P_n X_{n + h} = a_0 + a_1 X_n + \cdots + a_n X_1 = a_0 + \sum_{i=1}^n a_i X_{n+1-i} \]
- We need to find the coefficients \(a_0, a_1, \ldots, a_n\) that minimize \[ E(X_{n + h} - a_0 - a_1 X_n - \cdots - a_n X_1)^2 \]
- We can take partial derivatives and solve a system of equations \[\begin{align*} & E\left[ X_{n + h} - a_0 - \sum_{i=1}^n a_i X_{n+1-i}\right] = 0,\\ & E\left[ \left(X_{n + h} - a_0 - \sum_{i=1}^n a_i X_{n+1-i}\right) X_{n+1-j}\right] = 0, \ j = 1, \ldots, n \end{align*}\]
Best linear predictor: the solution
Plugging the solution \(a_0 = \mu \left( 1 - \sum_{i=1}^n a_i \right)\) back in, the linear predictor becomes \[ P_n X_{n + h} = \mu + \sum_{i=1}^n a_i (X_{n+1-i} - \mu) \]
- The solution of coefficients
\[
\mathbf{a}_n = (a_1, \ldots, a_n)' = \boldsymbol\Gamma_n^{-1} \boldsymbol\gamma_n(h)
\]
- \(\boldsymbol\Gamma_n = \left[ \gamma(i-j) \right]_{i, j = 1}^n\) and \(\boldsymbol\gamma_n (h) = \left( \gamma(h), \gamma(h+1), \ldots, \gamma(h + n - 1) \right)'\)
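The system \(\boldsymbol\Gamma_n \mathbf{a}_n = \boldsymbol\gamma_n(h)\) can be solved directly. A minimal sketch, using the AR\((1)\) ACVF as a test case (the helper name `blp_coefficients` is an arbitrary choice); it reproduces the one-step solution \((\phi, 0, \ldots, 0)\) derived in the AR\((1)\) example further below:

```python
import numpy as np

def blp_coefficients(gamma, n, h):
    """Solve Gamma_n a = gamma_n(h) for the best linear predictor coefficients.
    `gamma` is a callable returning the ACVF at a given lag."""
    Gamma = np.array([[gamma(i - j) for j in range(n)] for i in range(n)])
    gvec = np.array([gamma(h + k) for k in range(n)])    # (gamma(h), ..., gamma(h + n - 1))'
    a = np.linalg.solve(Gamma, gvec)
    mse = gamma(0) - a @ gvec                            # gamma(0) - a_n' gamma_n(h)
    return a, mse

# AR(1) test case: gamma(h) = sigma^2 phi^|h| / (1 - phi^2); the one-step solution
# should be a = (phi, 0, ..., 0) with MSE sigma^2
phi, sigma2 = 0.8, 1.0
acvf = lambda lag: sigma2 * phi ** abs(lag) / (1 - phi ** 2)
a, mse = blp_coefficients(acvf, n=5, h=1)
print(np.round(a, 6), mse)    # [0.8 0. 0. 0. 0.]  1.0
```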
Best linear predictor \(\hat{X}_{n+h} = P_n X_{n + h}\): properties
Unbiasedness \[ E(\hat{X}_{n+h} - X_{n+h}) = 0 \]
Mean squared error (MSE) \[\begin{align*} E(X_{n+h} - \hat{X}_{n+h})^2 & = E(X_{n+h}^2) - E(\hat{X}_{n+h}^2)\\ & = \gamma(0) - \mathbf{a}_n' \boldsymbol\gamma_n (h) = \gamma(0) - \boldsymbol\gamma_n (h)' \boldsymbol\Gamma_n^{-1} \boldsymbol\gamma_n (h) \end{align*}\]
Orthogonality \[ E\left[ (\hat{X}_{n+h} - X_{n+h}) X_j \right] = 0, \quad j = 1, \ldots, n \]
- In general, orthogonality means \[ E\left[ (\textrm{Error}) \times (\textrm{PredictorVariable}) \right] = 0 \]
Example: one-step prediction of an AR\((1)\) series
We predict \(X_{n+1}\) from \(X_1, \ldots, X_n\) \[ \hat{X}_{n+1} = \mu + a_1 (X_n - \mu) + \cdots + a_n (X_1 - \mu) \]
The coefficients \(\mathbf{a}_n = (a_1, \ldots, a_n)'\) satisfy \[ \left[ \begin{array}{ccccc} 1 & \phi & \phi^2 & \cdots & \phi^{n-1} \\ \phi & 1 & \phi & \cdots & \phi^{n-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi^{n-1}& \phi^{n-2}& \phi^{n-3}& \cdots & 1 \\ \end{array} \right] \left[ \begin{array}{c} a_1 \\ a_2 \\ \vdots\\ a_n \\ \end{array} \right] = \left[ \begin{array}{c} \phi \\ \phi^2 \\ \vdots\\ \phi^n \\ \end{array} \right] \]
By guessing, we find a solution \((a_1, a_2, \ldots, a_n) = (\phi, 0, \ldots, 0)\), i.e., \[ \hat{X}_{n+1} = \mu + \phi (X_n - \mu) \]
- Does not depend on \(X_{n-1}, \ldots, X_1\)
- MSE \(E(X_{n+1} - \hat{X}_{n+1})^2 = \sigma^2\)
WLOG, we can assume \(\mu=0\) while predicting
A stationary time series \(\{X_t\}\) has mean \(\mu\)
To predict its future values, we can first create another time series \[ Y_t = X_t - \mu \] and predict \(\hat{Y}_{n+h} = P_n(Y_{n+h})\) by \[ \hat{Y}_{n+h} = a_1 Y_n + \cdots + a_n Y_1 \]
Since ACVF \(\gamma_Y(h) = \gamma_X(h)\), the coefficients \(a_1, \ldots, a_n\) are the same for \(\{X_t\}\) and \(\{Y_t\}\)
The best linear predictor \(\hat{X}_{n+h} = P_n(X_{n+h})\) is then given by \[ \hat{X}_{n+h} - \mu = a_1 (X_n - \mu) + \cdots + a_n (X_1 - \mu) \]
Prediction operator \(P(\cdot \mid \mathbf{W})\)
\(X\) and \(W_1, \ldots, W_n\) are random variables with finite second moments
- Note: \(W_1, \ldots, W_n\) need not come from a stationary series
Best linear predictor: \[ \hat{X} = P(X \mid \mathbf{W}) = E(X) + a_1 \left[W_n - E(W_n)\right] + \cdots + a_n \left[W_1- E(W_1)\right] \]
Coefficients \(\mathbf{a} = (a_1, \ldots, a_n)'\) satisfy \[ \boldsymbol\Gamma \mathbf{a} = \boldsymbol\gamma \] where \(\boldsymbol\Gamma = \left[ Cov(W_{n+1-i}, W_{n+1-j}) \right]^n_{i,j=1}\) and \(\boldsymbol\gamma = \left[ Cov(X, W_n), \ldots, Cov(X, W_1) \right]'\)
Properties of \(\hat{X} = P(X \mid \mathbf{W})\)
Unbiased \(E(\hat{X} - X) = 0\)
Orthogonal \(E[(\hat{X} - X) W_i] = 0\) for \(i = 1, \ldots, n\)
MSE \[ E(\hat{X} - X)^2 = Var(X) - (a_1, \ldots, a_n) \left[ \begin{array}{c} Cov(X, W_n) \\ \vdots \\ Cov(X, W_1) \\ \end{array} \right] \]
Linear \[ P(\alpha_1 X_1 + \alpha_2 X_2 + \beta \mid \mathbf{W}) = \alpha_1 P(X_1 \mid \mathbf{W}) + \alpha_2 P(X_2 \mid \mathbf{W}) + \beta \]
- Extreme cases
- Perfect prediction \[ P\left(\sum_{i=1}^n \alpha_i W_i + \beta\mid \mathbf{W}\right) = \sum_{i=1}^n \alpha_i W_i + \beta \]
- Uncorrelated: if \(Cov(X, W_i) = 0\) for all \(i = 1, \ldots, n\), then \[ P(X \mid \mathbf{W}) = E(X) \]
Examples: predictions of AR\((p)\) series
A time series \(\{X_t\}\) is an autoregression of order \(p\), i.e., AR\((p)\), if it is stationary and satisfies \[ X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + Z_t \] where \(\{Z_t\} \sim \textrm{WN}(0, \sigma^2)\), and \(Cov(X_s, Z_t) = 0\) for all \(s < t\)
When \(n>p\), the one-step prediction of an AR\((p)\) series is \[ P_n X_{n+1} = \phi_1 X_{n} + \phi_2 X_{n-1} + \cdots + \phi_p X_{n+1-p} \] with MSE \(E\left(X_{n+1} - P_n X_{n+1} \right)^2 = E(Z_{n+1})^2 = \sigma^2\)
\(h\)-step prediction of an AR\((1)\) series (proof by recursions) \[ P_n X_{n+h} = \phi^h X_n, \quad \textrm{MSE} = \sigma^2\frac{1-\phi^{2h}}{1-\phi^2} \]
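A small sketch of the AR\((1)\) \(h\)-step forecast and its MSE (the function name, simulated data, and parameter values are illustrative):

```python
import numpy as np

def ar1_forecast(x, phi, sigma2, h, mu=0.0):
    """h-step forecast of an AR(1): P_n X_{n+h} = mu + phi^h (X_n - mu),
    with MSE sigma^2 (1 - phi^(2h)) / (1 - phi^2)."""
    pred = mu + phi ** h * (x[-1] - mu)
    mse = sigma2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)
    return pred, mse

# illustrative usage on a short simulated AR(1) path
rng = np.random.default_rng(3)
phi, sigma2, n = 0.7, 1.0, 200
Z = rng.normal(size=n)
X = np.zeros(n)
for t in range(1, n):
    X[t] = phi * X[t - 1] + Z[t]
for h in (1, 2, 5):
    print(h, ar1_forecast(X, phi, sigma2, h))
```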
Recursive methods: the Durbin-Levinson and Innovation Algorithms
Recursive methods for one-step prediction
The best linear predictor solution \(\mathbf{a} = \boldsymbol\Gamma^{-1} \boldsymbol\gamma\) needs matrix inversion
Alternatively, we can use recursion to simplify one-step prediction of \(P_n X_{n + 1}\), based on \(P_j X_{j+1}\) for \(j = 1, \ldots, n-1\)
- We will introduce
- Durbin-Levinson algorithm: good for AR\((p)\)
- Innovation algorithm: good for MA\((q)\); innovations are uncorrelated
Durbin-Levinson algorithm
- Assume \(\{X_t\}\) is mean zero, stationary, with ACVF \(\gamma(h)\) \[ \hat{X}_{n+1} = \phi_{n,1} X_n + \cdots + \phi_{n,n} X_1, \quad \textrm{with MSE } v_n = E(\hat{X}_{n+1} - X_{n+1})^2 \]
- Start with \(\hat{X}_1 = 0\) and \(v_0 = \gamma(0)\)
For \(n = 1, 2, \ldots\), compute the following three steps successively
Compute \(\phi_{n,n}\) (partial autocorrelation function (PACF) at lag \(n\)) \[\phi_{n,n} = \left[ \gamma(n) - \sum_{j=1}^{n-1} \phi_{n-1, j} \gamma(n-j) \right]/v_{n-1} \]
Compute \(\phi_{n, 1}, \ldots, \phi_{n, n-1}\) \[ \left[ \begin{array}{c} \phi_{n,1} \\ \vdots \\ \phi_{n, n-1} \\ \end{array} \right] = \left[ \begin{array}{c} \phi_{n-1, 1} \\ \vdots \\ \phi_{n-1, n-1} \\ \end{array} \right]- \phi_{n,n} \left[ \begin{array}{c} \phi_{n-1, n-1} \\ \vdots \\ \phi_{n-1, 1} \\ \end{array} \right] \]
Compute \(v_n\) \[ v_n = v_{n-1}(1 - \phi_{n, n}^2) \]
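A compact numpy sketch of the recursion above (the array indexing conventions and the AR\((1)\) test case are my own choices); for an AR\((1)\) ACVF the PACF \(\phi_{m,m}\) equals \(\phi\) at lag 1 and 0 at higher lags, and \(v_m\) reduces to \(\sigma^2\):

```python
import numpy as np

def durbin_levinson(gamma, n):
    """Durbin-Levinson recursion for a mean-zero stationary series.
    `gamma` is an array of ACVF values gamma(0), ..., gamma(n).
    Returns phi with phi[m, j] = phi_{m,j} (column 0 unused) and the MSEs v[0..n]."""
    phi = np.zeros((n + 1, n + 1))
    v = np.zeros(n + 1)
    v[0] = gamma[0]
    for m in range(1, n + 1):
        # partial autocorrelation at lag m
        phi[m, m] = (gamma[m] - np.sum(phi[m - 1, 1:m] * gamma[m - 1:0:-1])) / v[m - 1]
        # remaining coefficients, using the reversed previous row
        phi[m, 1:m] = phi[m - 1, 1:m] - phi[m, m] * phi[m - 1, m - 1:0:-1]
        # updated one-step MSE
        v[m] = v[m - 1] * (1 - phi[m, m] ** 2)
    return phi, v

# AR(1) test case: PACF is phi at lag 1 and 0 afterwards, and v_n equals sigma^2
phi_true, sigma2 = 0.8, 1.0
gamma = sigma2 * phi_true ** np.arange(6) / (1 - phi_true ** 2)
coef, v = durbin_levinson(gamma, 5)
print(np.round([coef[m, m] for m in range(1, 6)], 6))   # [0.8, 0, 0, 0, 0]
print(np.round(v[-1], 6))                               # 1.0
```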
Innovation algorithm
Assume \(\{X_t\}\) is any mean zero (not necessarily stationary) time series with covariance \(\kappa(i,j) = Cov(X_i, X_j)\)
Predict \(\hat{X}_{n+1} = P_n X_{n+1}\) based on innovations, or one-step prediction errors \(X_j - \hat{X}_j\), \(j = 1, \ldots, n\) \[ \hat{X}_{n+1} = \theta_{n,1} (X_n - \hat{X}_n) + \cdots + \theta_{n,n} (X_1 - \hat{X}_1)\quad \textrm{with MSE } v_n \]
- Start with \(\hat{X}_1 = 0\) and \(v_0 = \kappa(1, 1)\)
For \(n = 1, 2, \ldots\), compute the following two steps successively
For \(k = 0, 1, \ldots, n-1\), compute coefficients \[ \theta_{n, n-k} = \left[ \kappa(n+1, k+1) - \sum_{j=0}^{k-1} \theta_{k, k-j} \theta_{n, n-j} v_j \right]/v_k \]
Compute the MSE \[ v_n = \kappa(n+1, n+1) - \sum_{j=0}^{n-1} \theta_{n, n-j}^2 v_j \]
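A sketch of the algorithm under the same notation (the 1-indexed `kappa(i, j)` interface and the MA\((1)\) test case are illustrative choices). For an invertible MA\((1)\), \(\theta_{n,1}\) approaches \(\theta\) and \(v_n\) approaches \(\sigma^2\) as \(n\) grows:

```python
import numpy as np

def innovations(kappa, n):
    """Innovations algorithm.  `kappa(i, j)` returns Cov(X_i, X_j) with 1-based indices,
    matching the notation above.  Returns theta with theta[m, k] = theta_{m,k} and v[0..n]."""
    theta = np.zeros((n + 1, n + 1))
    v = np.zeros(n + 1)
    v[0] = kappa(1, 1)
    for m in range(1, n + 1):
        for k in range(m):                               # k = 0, 1, ..., m - 1
            s = sum(theta[k, k - j] * theta[m, m - j] * v[j] for j in range(k))
            theta[m, m - k] = (kappa(m + 1, k + 1) - s) / v[k]
        v[m] = kappa(m + 1, m + 1) - sum(theta[m, m - j] ** 2 * v[j] for j in range(m))
    return theta, v

# MA(1) test case: X_t = Z_t + theta*Z_{t-1} with theta = 0.5, sigma^2 = 1; as n grows,
# theta_{n,1} approaches theta and v_n approaches sigma^2
th, sigma2 = 0.5, 1.0
def kappa(i, j):
    if i == j:
        return (1 + th ** 2) * sigma2
    return th * sigma2 if abs(i - j) == 1 else 0.0

theta_mat, v = innovations(kappa, 20)
print(round(theta_mat[20, 1], 4), round(v[20], 4))   # close to 0.5 and 1.0
```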
\(h\)-step predictors using innovations
For any \(k \geq 1\), orthogonality ensures \[ E\left[ \left(X_{n+k} - P_{n+k-1} X_{n+k}\right) X_j \right] = 0, \quad j = 1, \ldots, n \] Thus, we have \[ P_n(X_{n+k} - P_{n+k-1} X_{n+k}) = 0 \]
The \(h\)-step prediction: \[\begin{align*} P_n X_{n+h} &= P_n P_{n+h-1} X_{n+h}\\ &= P_n \left[ \sum_{j=1}^{n+h-1} \theta_{n+h-1, j} \left(X_{n+h-j}- \hat{X}_{n+h-j} \right) \right]\\ &= \sum_{j=h}^{n+h-1} \theta_{n+h-1, j} \left(X_{n+h-j}- \hat{X}_{n+h-j} \right) \end{align*}\]
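A sketch of the \(h\)-step formula that reuses the `innovations` function from the previous sketch; the MA\((1)\) covariance, parameter values, and simulated path are illustrative. For an MA\((1)\) model the forecasts with \(h \geq 2\) are zero, since \(\theta_{m,j} = 0\) for \(j \geq 2\):

```python
import numpy as np

def h_step_innovations(x, kappa, h):
    """P_n X_{n+h} = sum_{j=h}^{n+h-1} theta_{n+h-1, j} (X_{n+h-j} - X_hat_{n+h-j}).
    Reuses `innovations` from the sketch above; `kappa(i, j)` is the model covariance."""
    n = len(x)
    theta, v = innovations(kappa, n + h - 1)
    innov = np.zeros(n)                   # one-step errors: innov[m] = X_{m+1} - X_hat_{m+1}
    for m in range(n):
        xhat = sum(theta[m, j] * innov[m - j] for j in range(1, m + 1))
        innov[m] = x[m] - xhat
    return sum(theta[n + h - 1, j] * innov[n + h - j - 1] for j in range(h, n + h))

# MA(1) example with theta = 0.5, sigma^2 = 1: the h = 2 forecast is zero
th = 0.5
def kappa(i, j):
    return (1 + th ** 2) if i == j else (th if abs(i - j) == 1 else 0.0)

rng = np.random.default_rng(5)
Z = rng.normal(size=101)
x = Z[1:] + th * Z[:-1]
print(h_step_innovations(x, kappa, 1), h_step_innovations(x, kappa, 2))
```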
References
- Brockwell, Peter J. and Davis, Richard A. (2016), Introduction to Time Series and Forecasting, Third Edition. New York: Springer