Below we provide a general outline for building multivariate time series models, based on the methods presented in the lectures, along with some additional functions for a broader perspective:

**Variable selection & purpose**

The initial step is to understand the task at hand: what kind of variables should be included, and what maximum lag order would make sense, based on the frequency and description of the data? What underlying economic or financial theory is assumed beforehand?

**Data analysis and transformations**

After selecting the data, the second step is to examine the data plots and summary statistics and answer these questions:

- Are there any outliers in the data?
- Is there any missing data?
- Do you need to transform the data, e.g. take `logarithms`?
- Do you need to create any new variables, e.g. GDP per capita, number of children per household, etc.?

**Unit root tests**

After determining the final set of variables, \(Y_t\), we need to test whether they have a unit root (\(I(1)\), \(I(2)\), …) or are stationary (\(I(0)\)). To do this, use the `ADF` test - the `ur.df()` function from the `urca` package does just this, with `type = c("none", "drift", "trend")` for the different model types:

- `none` - no constant and no trend in the model;
- `drift` - a constant in the model;
- `trend` - a constant + trend in the model.

The lag order selection can also be automated by including `lags = 10, selectlags = "BIC"` to estimate different \(AR(p)\) models up to \(p = 10\) and select the best one in terms of its \(BIC\) value. Note that the null hypothesis of the \(ADF\) test is that the data *has a unit root*.

Other tests are also available and should be used as well (a short sketch of all four tests follows the list below):

- We can also use `pp.test()` from the `tseries` package to carry out the `Phillips-Perron` (PP) test for the *null hypothesis that the data has a unit root*. Note that the regression equation in this test incorporates a constant and a linear trend, so we cannot specify different model types as with the `ur.df()` function.
- We can use `kpss.test()` from the `tseries` package to carry out the `Kwiatkowski-Phillips-Schmidt-Shin` (KPSS) test for the *null hypothesis that the data is stationary*.
- There is also `adf.test()` from the `tseries` package, which uses the regression equation with a constant and a linear trend for the `Augmented Dickey-Fuller` (ADF) test of *the null hypothesis that the data has a unit root*. Again, we cannot specify different model types as in `ur.df()`.
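
A minimal sketch of running all four tests on a single series (the series `y` below is simulated purely for illustration; in practice, use your own data):

```
# Hypothetical example: y is a simulated random walk, i.e. an I(1) series
set.seed(123)
y <- cumsum(rnorm(200))

library(urca)
library(tseries)

summary(ur.df(y, type = "drift", lags = 10, selectlags = "BIC")) # ADF, H0: unit root
pp.test(y)   # Phillips-Perron, H0: unit root (constant + trend in the regression)
kpss.test(y) # KPSS, H0: stationarity
adf.test(y)  # ADF with constant + trend, H0: unit root
```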

If we determine that the data **is not** \(I(0)\), then we need to check if it is \(I(1)\). To do this, carry out the unit root tests on the **differences** of the data. If the differences do not have a unit root (i.e. \(\Delta Y_t\) is \(I(0)\)), then the series \(Y_t\) is \(I(1)\). If the differences do have a unit root, then we need to difference the differences and check if the series is \(I(2)\). If the series is \(I(2)\), then the interpretation of the data changes - e.g. \(\Delta \Delta Y_t\) could be interpreted as the *difference of the change in* \(Y_t\). If \(\Delta \Delta Y_t\) is still not \(I(0)\), then you should re-examine your data for any outliers or errors, as higher orders of differencing are harder to interpret.
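
A short sketch of this check, reusing the simulated `y` from the previous example:

```
# If y is not I(0), test its first differences:
summary(ur.df(diff(y), type = "drift", lags = 10, selectlags = "BIC"))
# Rejecting the unit root null for diff(y) implies that y is I(1);
# otherwise, test diff(y, differences = 2) to check whether y is I(2).
```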

**VAR(p) model order selection**

Model order selection is usually based on some kind of information criterion - typically \(AIC\) or \(BIC\), sometimes others. This can be done using `VARselect()` from the `vars` package. Note that \(BIC\) penalizes model complexity more heavily than \(AIC\).

**Cointegrating relationship test for unit root series**

If the time series have a unit root, we should check whether there are any cointegrating relationships between them. Three types of tests can be performed:

The `Engle-Granger procedure` - it states that if **two** variables \(Y_t\) and \(X_t\) are **non-stationary** and **cointegrated**, then a linear combination of them must be **stationary**. In other words, the test is done in two steps:

**Step 1**: estimate the long-run (equilibrium) equation: \[Y_t = \alpha + \gamma \cdot t + \beta X_t + u_t\] In order to test for cointegration, we need to test whether \(\widehat{u}_t\) is stationary. This is determined via `ADF` (`ur.df()`) tests on the model residuals (the residuals have a zero mean, so we exclude the constant term): \[\Delta \widehat{u}_t = \rho \widehat{u}_{t-1} + \sum_{i = 1}^p c_i \Delta \widehat{u}_{t-i} + w_t\] using critical values adjusted for the number of variables. The null hypothesis of the `ADF` test is \(H_0: \rho = 0\), i.e. the residuals have a unit root. If we reject the null hypothesis, then the residuals are stationary and we conclude that \(Y_t\) and \(X_t\) **are cointegrated**. Note that in this step the only important question is whether or not the residuals are stationary, i.e. traditional diagnostic tests are unimportant.

**Step 2**: in the second step, the lagged residuals are included in the regression to create an \(ECM\): \[\Delta Y_t = \phi + \delta t + \lambda \hat{u}_{t-1} + \gamma_1 \Delta Y_{t-1} + ... + \gamma_p \Delta Y_{t-p} + \omega_0 \Delta X_t + ... + \omega_q \Delta X_{t-q} + \epsilon_t\] The coefficient \(\lambda\) **must be negative** for the system to correct back towards equilibrium. (The same applies to the \(\Delta X_t\) equation.)
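
A minimal sketch of the two steps, using simulated series that are cointegrated by construction (all object names here are our own):

```
set.seed(42)
x <- cumsum(rnorm(250))       # x is a random walk, i.e. I(1)
y <- 1 + 0.5 * x + rnorm(250) # y shares the stochastic trend of x

# Step 1: long-run equation and an ADF test on its residuals
# (no constant, since the residuals have zero mean; remember that the
#  critical values must be adjusted for the number of variables)
long_run <- lm(y ~ x)
u_hat    <- residuals(long_run)
summary(urca::ur.df(u_hat, type = "none", lags = 5, selectlags = "BIC"))

# Step 2: an ECM with the lagged residuals (only one difference lag shown)
ecm <- lm(diff(y) ~ head(u_hat, -1) + diff(x))
summary(ecm) # the coefficient on the lagged residual should be negative
```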

The `Phillips-Ouliaris cointegration test` - tests the null hypothesis \(H_0: Y_t \text{ and } X_t \text{ are not cointegrated}\). This test can be performed via `ca.po()` from the `urca` package, or `po.test()` from the `tseries` package. Phillips and Ouliaris showed that the ADF and PP unit root tests applied to the estimated cointegrating residual do not have the usual Dickey-Fuller distributions under the null \(H_0: \text{no cointegration}\). Note that the test type `type = "Pz"`, compared to `type = "Pu"`, has the advantage that it is invariant to the normalization of the cointegration vector, i.e. it does not matter which variable is on the left-hand side of the equation.
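
A short sketch, reusing `y` and `x` from the Engle-Granger example above (the first column of the input matrix is taken as the left-hand-side variable):

```
z <- cbind(y, x)
summary(urca::ca.po(z, type = "Pz")) # H0: no cointegration; normalization-invariant
tseries::po.test(z)                  # H0: no cointegration
```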

The `Johansen test` - a test for cointegration that allows for more than one cointegrating relationship, unlike the Engle-Granger method. If the sample size is too small, the results will not be reliable, and one should use an Autoregressive Distributed Lag (ADL) model instead. The `ca.jo()` function from the `urca` package can be used to carry out this test, which is based on the following \(VAR\) and \(VECM\) specifications:

Given a general VAR model: \[\mathbf{Y}_t = \mathbf{\mu} + \Phi \mathbf{D}_t + \Theta_1 \mathbf{Y}_{t-1} + ... + \Theta_p \mathbf{Y}_{t-p} + \mathbf{\epsilon}_t,\quad t = 1,...,T\] where \(\mathbf{\mu}\) is the constant and \(\mathbf{D}_t\) is the matrix of deterministic terms (e.g. a trend), there are two possible specifications of a VECM:

- The first form: \[\Delta \mathbf{Y}_t = \mathbf{\mu} + \Phi \mathbf{D}_t + \Gamma_1 \Delta \mathbf{Y}_{t-1} + ... + \Gamma_{p-1} \Delta \mathbf{Y}_{t-p+1} + \Pi \mathbf{Y}_{t-p} + \mathbf{\epsilon}_t\] where: \[
\begin{aligned}
\Gamma_i &= \sum_{j = 1}^i \Theta_j - \mathbf{I},\quad i = 1,...,p-1 \\
\Pi &= \sum_{k = 1}^p \Theta_k - \mathbf{I}
\end{aligned}
\] The \(\Gamma_i\) matrices contain the cumulative *long-run* impacts. If this \(VECM\) specification is required, then `spec = "longrun"` must be specified when calling the `ca.jo()` function.
- The second form: \[\Delta \mathbf{Y}_t = \mathbf{\mu} + \Phi \mathbf{D}_t + \Gamma_1 \Delta \mathbf{Y}_{t-1} + ... + \Gamma_{p-1} \Delta \mathbf{Y}_{t-p+1} + \Pi \mathbf{Y}_{t-1} + \mathbf{\epsilon}_t\] where: \[
\begin{aligned}
\Gamma_i &= -\sum_{j = i+1}^p \Theta_j,\quad i = 1,...,p-1 \\
\Pi &= \sum_{k = 1}^p \Theta_k - \mathbf{I}
\end{aligned}
\] The \(\Pi\) matrix is the same as in the first specification, but the \(\Gamma_i\) matrices now measure the *transitory* effects. By setting `spec = "transitory"` when calling `ca.jo()`, the VECM of the second specification form will be estimated.

Note that the results for \(\Pi\) will be the same, regardless of the chosen specification. The explanatory power will be the same as well.
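
A short sketch of calling `ca.jo()` with both specifications, reusing the matrix `z` from the example above; the \(\Pi\)-related results should agree:

```
jo_longrun    <- urca::ca.jo(z, type = "trace", K = 2, spec = "longrun")
jo_transitory <- urca::ca.jo(z, type = "trace", K = 2, spec = "transitory")
summary(jo_longrun)
summary(jo_transitory)
```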

The `Johansen test` has two variations, both of which are likelihood-ratio tests. For both test statistics, the initial null hypothesis is \[H_0:\text{ no cointegration}\] The tests differ in terms of the alternative hypothesis:

**Maximum Eigenvalue Test**

The maximum eigenvalue test sequentially examines whether the largest remaining eigenvalue of \(\Pi\) is zero, i.e. whether the rank of \(\Pi\) is \(r\) against the alternative that it is \(r + 1\). The outline of the test is as follows:

- The first test is done for \[H_0: rank(\Pi) = 0\] with the alternative \[H_1: rank(\Pi) = 1\] If the rank of the matrix is zero, the largest eigenvalue is zero, there is no cointegration, and the testing is done.
- Further tests (if we rejected \(H_0: rank(\Pi) = 0\)) check \[H_0: rank(\Pi) = r\] against \[H_1: rank(\Pi) = r+1\] for \(r = 1, 2, \dots\). If at any point we do not reject \(H_0\), the testing stops and the rank of \(\Pi\) (i.e. the number of cointegrating relations) is equal to \(r\).

**Trace Test**

The trace test checks whether the rank of the matrix \(\Pi\) is \(r\): the null is \[H_0: rank(\Pi) = r\] and the alternative is \[H_1: r < rank(\Pi) \leq K\] where \(K\) is the maximum possible number of cointegrating vectors (i.e. the number of rows in \(\Pi\)). If we reject \(H_0\), the next test involves \[H_0: rank(\Pi) = r + 1\] with the alternative \[H_1: r + 1 < rank(\Pi) \leq K\] and so on, for \(r = 0,\dots,K-1\). Note that, contrary to the name, this test is **not** based on the trace of \(\Pi\).


**Estimating the model**

If the series are not cointegrated, we can estimate the model via the `VAR()` function from the `vars` package, using the differences of the series, \(\Delta Y_t\), if a unit root is present. If the series are cointegrated, we need to account for the long-run relationship by estimating a \(VECM\), using either `VECM()` from the `tsDyn` package, or `cajorls()` from the `urca` package, and specifying the number of cointegrating relations found in the previous step. Depending on the function, we may also need to specify the `lag` order of the \(VECM\) representation.
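
A minimal sketch of both routes, assuming the series in `z` (from the examples above) are cointegrated with one cointegrating relation (`r = 1`) and one lagged difference:

```
# Route 1: tsDyn::VECM()
library(tsDyn)
vecm_tsdyn <- VECM(z, lag = 1, r = 1, include = "const", estim = "ML")
summary(vecm_tsdyn)

# Route 2: urca - run the Johansen procedure, then the restricted VECM
jo        <- urca::ca.jo(z, type = "trace", K = 2, spec = "transitory")
vecm_urca <- urca::cajorls(jo, r = 1)
vecm_urca$rlm  # the estimated VECM equations
vecm_urca$beta # the normalized cointegrating vector(s)
```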

**Model diagnostics tests**

Now that we have estimated the model, we need to verify that it is well specified. This is usually done by examining the residuals of the model. The most important tests for time series data are tests for autocorrelation (or serial correlation) of the residuals, also known as `Portmanteau` tests. The two most well-known versions of this test are the `Ljung-Box` and `Box-Pierce` tests, which are implemented in the `Box.test()` function from the `stats` package. For multivariate time series, alongside autocorrelation, another problem is **cross-correlation** of the residuals, i.e. when \(cor(\epsilon_{1,t}, \epsilon_{2, t + s}) \neq 0,\quad s > 0\). For this reason, we may use the `serial.test()` function from the `vars` package, which computes multivariate tests for serially correlated errors. A multivariate Ljung-Box test is also implemented in the `mq()` function from the `MTS` package.
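
A sketch of these diagnostics, assuming `VAR_mdl` is a model fitted with `vars::VAR()` (as in the example later in this section):

```
# Multivariate Portmanteau test for serially correlated errors:
vars::serial.test(VAR_mdl, lags.pt = 12, type = "PT.asymptotic")

# Univariate Ljung-Box test on the residuals of a single equation:
Box.test(residuals(VAR_mdl)[, 1], lag = 12, type = "Ljung-Box")

# Multivariate Ljung-Box test (requires the MTS package):
MTS::mq(residuals(VAR_mdl), lag = 12)
```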

**Results & conclusions**

After we verify that the model is adequate, we can either `predict()` future values, or examine the impulse-response functions via `irf()` from the `vars` package in order to check how the variables respond to a particular shock.
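
For example (a sketch, again assuming a fitted `VAR_mdl` with variables named `Y` and `X`):

```
fcst <- predict(VAR_mdl, n.ahead = 12) # 12-step-ahead forecasts
plot(fcst)

# Response of X to a one-time shock in Y, with bootstrap confidence bands:
imp <- vars::irf(VAR_mdl, impulse = "Y", response = "X", n.ahead = 10, boot = TRUE)
plot(imp)
```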

It is also strongly recommended to partition the dataset into `training` and `test` subsets and estimate the model on the `training` data. We can then compare the forecasts with the `test` data to measure the forecasting strength of our model by calculating a *forecast error* measure, like the \(MAPE\) or \(RMSE\). The size of the `test` set is usually up to `20%` of the total sample; for time series data, the `test` set can be the **last** `20%` of the observations. We can also repeat this process by estimating the model on the first \(T - j\) observations and evaluating the forecasts \(h\) periods ahead, for different values of \(j = T - k, T - k - 1, \dots, h\), where \(T - k\) is around `20%` of the sample. We can then evaluate how the forecast error changes as the estimation sample grows.
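
A minimal sketch of such a holdout evaluation, assuming `Y` is a matrix of time series (such as the one simulated below) with a column named `Y`:

```
T_total <- nrow(Y)
T_train <- floor(0.8 * T_total) # keep the last 20% as the test set
h       <- T_total - T_train

mdl_train <- vars::VAR(Y[1:T_train, ], p = 2, type = "const")
fcst      <- predict(mdl_train, n.ahead = h)

# RMSE of the h-step-ahead forecasts for the first variable:
pred_Y <- fcst$fcst$Y[, "fcst"]
true_Y <- Y[(T_train + 1):T_total, "Y"]
sqrt(mean((true_Y - pred_Y)^2))
```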

Based on this answer at stackoverflow.com

We will generate data from the following VAR(2) model:

\[ \begin{pmatrix} Y_{t} \\ X_{t} \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \end{pmatrix} + \begin{pmatrix} 0.5 & 0.4 \\ -0.2 & 0.6 \end{pmatrix} \begin{pmatrix} Y_{t-1} \\ X_{t-1} \end{pmatrix} + \begin{pmatrix} -0.2 & -0.1 \\ 0.3 & 0.1 \end{pmatrix} \begin{pmatrix} Y_{t-2} \\ X_{t-2} \end{pmatrix} + \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix} ,\quad \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix} \sim \mathcal{N} \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 8 & 6 \\ 6 & 9 \end{bmatrix} \right) \]

```
# Simulate nsample = 500 observations from the VAR(2) model above
set.seed(123)
nsample <- 500
alpha   <- c(3, 2)
Theta_1 <- matrix(c(0.5, 0.4, -0.2, 0.6), ncol = 2, byrow = TRUE)
Theta_2 <- matrix(c(-0.2, -0.1, 0.3, 0.1), ncol = 2, byrow = TRUE)
Sigma   <- matrix(c(8, 6, 6, 9), ncol = 2, byrow = TRUE)
# Draw the correlated innovations:
e <- MASS::mvrnorm(n = nsample, mu = c(0, 0), Sigma = Sigma)
Y <- matrix(data = 0, nrow = nsample, ncol = 2)
colnames(Y) <- c("Y", "X")
# Initialize the first two observations, then iterate the recursion:
Y[1, ] <- alpha + e[1, ]
Y[2, ] <- alpha + Theta_1 %*% Y[1, ] + e[2, ]
for (j in 3:nsample) {
  Y[j, ] <- alpha + Theta_1 %*% Y[j - 1, ] + Theta_2 %*% Y[j - 2, ] + e[j, ]
}
plot.ts(Y, plot.type = "multiple")
```

We could test for a unit root any number of ways: sequentially by creating `dynlm` equations, using `ur.df()`, or `adf.test()`:

```
suppressPackageStartupMessages({
  suppressWarnings({
    library(urca)
  })
})
adf_test_Y <- ur.df(Y[, 1], type = "drift", lags = 5, selectlags = "BIC")
summary(adf_test_Y)
```

```
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.928 -2.091 -0.126 1.894 9.443
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.14957 0.43437 9.553 < 2e-16 ***
## z.lag.1 -0.48234 0.04807 -10.035 < 2e-16 ***
## z.diff.lag1 0.23322 0.04467 5.221 2.64e-07 ***
## z.diff.lag2 -0.16735 0.04450 -3.761 0.00019 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.939 on 490 degrees of freedom
## Multiple R-squared: 0.3082, Adjusted R-squared: 0.304
## F-statistic: 72.78 on 3 and 490 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -10.0349 50.3495
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
```

```
adf_test_X <- ur.df(Y[, 2], type = "drift", lags = 5, selectlags = "BIC")
summary(adf_test_X)
```

```
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.3332 -1.8041 -0.1827 1.9078 9.5116
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.96144 0.45762 6.471 2.36e-10 ***
## z.lag.1 -0.30167 0.04459 -6.765 3.81e-11 ***
## z.diff.lag1 -0.30417 0.05038 -6.038 3.09e-09 ***
## z.diff.lag2 -0.16015 0.04448 -3.600 0.00035 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.951 on 490 degrees of freedom
## Multiple R-squared: 0.2824, Adjusted R-squared: 0.278
## F-statistic: 64.28 on 3 and 490 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -6.765 22.8827
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
```

Because the `t value` of `z.lag.1` is `-10.035 < -2.87`, we reject the null hypothesis of a unit root in \(Y_t\). Since `-6.765 < -2.87`, we also reject the null hypothesis of a unit root in \(X_t\).

So, both \(Y_t\) and \(X_t\) are \(I(0)\).

We can automate the order selection via `VARselect()`, using information criteria. Note that we can specify `type = c("const", "trend", "both", "none")` if we want to include a constant or a trend (or both). Because we do not see any patterns of a trend, but we do see variation around a non-zero mean, we include a constant:

```
suppressPackageStartupMessages({
  suppressWarnings({
    library(vars)
  })
})
VARselect(Y, lag.max = 5, type = "const")
```

```
## $selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 2 2 2 2
##
## $criteria
## 1 2 3 4 5
## AIC(n) 4.112833 3.572596 3.585883 3.596214 3.606383
## HQ(n) 4.132840 3.605941 3.632565 3.656235 3.679741
## SC(n) 4.163797 3.657537 3.704799 3.749107 3.793252
## FPE(n) 61.119624 35.608981 36.085331 36.460226 36.833118
```

We see that a \(VAR(2)\) model is recommended.

Regarding Johansen's procedure - the matrix \(\Pi\) has full rank when the process is stationary. The Johansen procedure requires that the time series be checked for unit roots first, since it was not designed to be used as a unit root check. We may still try to carry out the test.

Note that we can examine an automated procedure which chooses the “best” VEC model (that is, its `rank` and `lag`) according to the information criteria (this approach is **not** based on the Johansen test):

```
suppressPackageStartupMessages({
  suppressWarnings({
    library(tsDyn)
  })
})
# Suppress function warnings:
suppressWarnings({
  rank.select(Y, lag.max = 10, include = "const")
})
```

```
## Best AIC: rank= 2 lag= 1
## Best BIC: rank= 2 lag= 1
## Best HQ : rank= 2 lag= 1
```

This suggests that the \(\Pi_{2 \times 2}\) matrix has full rank, indicating that both of our time series are stationary. Now let us look at the `Johansen-Procedure`: since `VARselect()` suggested `lag = 2` for the VAR and `rank.select()` suggested `lag = 1` for the VECM, we choose `lag = 1` for the VECM order:

```
suppressWarnings({
  temp_mdl <- VECM(Y, include = "const", lag = 1, estim = "ML")
  summary(rank.test(temp_mdl))
})
```

```
## r trace trace_pval trace_pval_T eigen eigen_pval
## 1 0 342.81505 < 0.001 < 0.001 277.60295 < 0.001
## 2 1 65.21211 < 0.001 < 0.001 65.21211 < 0.001
```

Note that once we specify the \(VECM\) model itself, we **cannot directly test for a full rank**, as the VECM model itself assumes that the data has a unit root (and thus the \(\Pi\) matrix does not have full rank). Thankfully, the output provides two tests:

- `trace` - tests the null hypothesis of \(rank = r\) against the alternative \(r < rank \leq K\) (where `K` is the number of rows in the \(\Pi\) matrix, i.e. the full rank), i.e. against the alternative that the system is stationary;
- `eigenvalue` - tests the null hypothesis of \(rank = r\) against the alternative \(rank = r + 1\).

We see that for \(r = 1\) the `trace_pval < 0.05`, so we reject the null hypothesis that \(r = 1\) (and that \(r = 0\)), which means that the system is stationary (because the only remaining option is \(r = 2\), the full rank). Furthermore, `eigen_pval < 0.05`, so we reject the null hypothesis that \(r = 1\) against the alternative that \(r = 1 + 1 = 2\).

We get similar results with `ca.jo()` (note that `K` in `ca.jo()` corresponds to `lag + 1` in `rank.test()`). Because we specified a \(VECM(1)\) for our `rank.test()`, we set `K = 2`:

```
temp_tst <- ca.jo(Y, K = 2, type = "trace")
paste0(temp_tst@type, ", ", temp_tst@model)
```

`## [1] "trace statistic, with linear trend"`

`print(cbind(capture.output(summary(temp_tst))[11:16]))`

```
## [,1]
## [1,] "Values of teststatistic and critical values of test:"
## [2,] ""
## [3,] " test 10pct 5pct 1pct"
## [4,] "r <= 1 | 65.21 6.50 8.18 11.65"
## [5,] "r = 0 | 342.82 15.66 17.95 23.52"
## [6,] ""
```

```
temp_tst <- ca.jo(Y, K = 2, type = "eigen")
paste0(temp_tst@type, ", ", temp_tst@model)
```

`## [1] "maximal eigenvalue statistic (lambda max), with linear trend"`

`print(cbind(capture.output(summary(temp_tst))[11:16]))`

```
## [,1]
## [1,] "Values of teststatistic and critical values of test:"
## [2,] ""
## [3,] " test 10pct 5pct 1pct"
## [4,] "r <= 1 | 65.21 6.50 8.18 11.65"
## [5,] "r = 0 | 277.60 12.91 14.90 19.19"
## [6,] ""
```

We see that the results are the same - we reject \(H_0: r = 0\) and \(H_0: r \leq 1\) in both the `trace` and `eigenvalue` tests, so our cointegration matrix has full rank, i.e. the system is stationary.

Now that we have established that both of our time series are \(I(0)\), we can estimate a VAR model. Based on `VARselect()`, we have chosen a \(VAR(2)\) model with a constant:

```
VAR_mdl <- VAR(Y, p = 2, type = "const")
summary(VAR_mdl)
```

```
##
## VAR Estimation Results:
## =========================
## Endogenous variables: Y, X
## Deterministic variables: const
## Sample size: 498
## Log Likelihood: -2292.506
## Roots of the characteristic polynomial:
## 0.7214 0.4899 0.4899 0.07252
## Call:
## VAR(y = Y, p = 2, type = "const")
##
##
## Estimation results for equation Y:
## ==================================
## Y = Y.l1 + X.l1 + Y.l2 + X.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## Y.l1 0.51225 0.05000 10.246 < 2e-16 ***
## X.l1 0.34573 0.04846 7.135 3.48e-12 ***
## Y.l2 -0.22310 0.04790 -4.658 4.12e-06 ***
## X.l2 -0.10460 0.05046 -2.073 0.0387 *
## const 3.76166 0.42008 8.955 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 2.832 on 493 degrees of freedom
## Multiple R-Squared: 0.418, Adjusted R-squared: 0.4133
## F-statistic: 88.51 on 4 and 493 DF, p-value: < 2.2e-16
##
##
## Estimation results for equation X:
## ==================================
## X = Y.l1 + X.l1 + Y.l2 + X.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## Y.l1 -0.22639 0.05072 -4.463 1.00e-05 ***
## X.l1 0.54255 0.04916 11.037 < 2e-16 ***
## Y.l2 0.29039 0.04859 5.976 4.39e-09 ***
## X.l2 0.07986 0.05119 1.560 0.119
## const 3.17114 0.42616 7.441 4.47e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 2.873 on 493 degrees of freedom
## Multiple R-Squared: 0.3778, Adjusted R-squared: 0.3727
## F-statistic: 74.83 on 4 and 493 DF, p-value: < 2.2e-16
##
##
##
## Covariance matrix of residuals:
## Y X
## Y 8.023 5.602
## X 5.602 8.257
##
## Correlation matrix of residuals:
## Y X
## Y 1.0000 0.6883
## X 0.6883 1.0000
```

Now we can write the model:

\[ \begin{cases} Y_t &= 3.76166 + 0.51225 Y_{t-1} + 0.34573 X_{t-1} - 0.22310 Y_{t-2} - 0.10460 X_{t-2}\\ X_t &= 3.17114 - 0.22639 Y_{t-1} + 0.54255 X_{t-1} + 0.29039 Y_{t-2} + 0.07986 X_{t-2} \end{cases} \]

We could also re-write the system in a matrix form for easier comparison with the actual model (try it!). We see that the second equation is not as accurately estimated as the first one (note the insignificant \(X_{t-2}\) coefficient).

We can estimate a **restricted VAR**, either by manually imposing zero restrictions, or automatically, based on coefficient significance:

```
VAR_mdl_restricted <- restrict(VAR_mdl, method = "ser", thresh = 2.0, resmat = NULL)
summary(VAR_mdl_restricted)
```

```
##
## VAR Estimation Results:
## =========================
## Endogenous variables: Y, X
## Deterministic variables: const
## Sample size: 498
## Log Likelihood: -2294.831
## Roots of the characteristic polynomial:
## 0.6746 0.5249 0.5249 0.1855
## Call:
## VAR(y = Y, p = 2, type = "const")
##
##
## Estimation results for equation Y:
## ==================================
## Y = Y.l1 + X.l1 + Y.l2 + X.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## Y.l1 0.51225 0.05000 10.246 < 2e-16 ***
## X.l1 0.34573 0.04846 7.135 3.48e-12 ***
## Y.l2 -0.22310 0.04790 -4.658 4.12e-06 ***
## X.l2 -0.10460 0.05046 -2.073 0.0387 *
## const 3.76166 0.42008 8.955 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 2.832 on 493 degrees of freedom
## Multiple R-Squared: 0.418, Adjusted R-squared: 0.4133
## F-statistic: 88.51 on 4 and 493 DF, p-value: < 2.2e-16
##
##
## Estimation results for equation X:
## ==================================
## X = Y.l1 + X.l1 + Y.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## Y.l1 -0.22379 0.05077 -4.408 1.28e-05 ***
## X.l1 0.57220 0.04540 12.603 < 2e-16 ***
## Y.l2 0.32956 0.04166 7.910 1.70e-14 ***
## const 3.30262 0.41835 7.894 1.90e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 2.878 on 494 degrees of freedom
## Multiple R-Squared: 0.9248, Adjusted R-squared: 0.9242
## F-statistic: 1520 on 4 and 494 DF, p-value: < 2.2e-16
##
##
##
## Covariance matrix of residuals:
## Y X
## Y 8.023 5.602
## X 5.602 8.297
##
## Correlation matrix of residuals:
## Y X
## Y 1.0000 0.6866
## X 0.6866 1.0000
```

We now see our restricted VAR model:

\[ \begin{cases} Y_t &= 3.76166 + 0.51225 Y_{t-1} + 0.34573 X_{t-1} - 0.22310 Y_{t-2} - 0.10460 X_{t-2}\\ X_t &= 3.30262 - 0.22379 Y_{t-1} + 0.57220 X_{t-1} + 0.32956 Y_{t-2} \end{cases} \]

We also note the estimated `Covariance matrix of residuals`:

\[ \widehat{\Sigma}_\epsilon = \begin{pmatrix} 8.023 & 5.602 \\ 5.602 & 8.297 \end{pmatrix} \]

which is quite close to the true \(\Sigma_\epsilon\).

We can also look at the fitted values:

`plot(VAR_mdl_restricted, names = "Y")`

`plot(VAR_mdl_restricted, names = "X")`