General outline for Multivariate Time Series VAR model building

Below we provide a general outline for building multivariate time series models, based on the methods presented in the lectures, along with some additional functions for a broader perspective:

  1. Variable selection & purpose
    The initial step is to understand the task at hand: which variables should be included, and what maximum lag order would make sense given the frequency and description of the data? What underlying economic or financial theory is assumed beforehand?

  2. Data analysis and transformations
    After selecting the data, the second step would be to examine the data plots and summary statistics and answer these questions:

    • Are there any outliers in the data?
    • Is there any missing data?
    • Do you need to transform the data, e.g. take logarithms?
    • Do you need to create any new variables - e.g. GDP per capita, number of children per household etc.?

  3. Unit root tests
    After determining the final set of variables, \(Y_t\), we need to test whether they have a unit root (\(I(1)\), \(I(2)\), …) or are stationary (\(I(0)\)). To do this, use the ADF test: the ur.df() function from the urca package allows us to do just this, with type = c("none", "drift", "trend") for the different model types:

    • none - no constant and no trend in the model,

    • drift - constant in the model,

    • trend - constant + trend in the model.

    The model selection can also be automated by including lags = 10, selectlags = "BIC" to estimate different \(AR(p)\) models up to \(p = 10\) and select the best one in terms of its \(BIC\) value. Note that the null hypothesis of the \(ADF\) test is that the data has a unit root. Other tests are also available and should be used as well:

    • We can also use pp.test() from the tseries package to carry out the Phillips-Perron (PP) test for the null hypothesis that the data has a unit root. Note that the regression equation in this test incorporates a constant and a linear trend, so we cannot specify different model types like with the ur.df() function.

    • We can use the kpss.test() from the tseries package to carry out the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for the null hypothesis that the data is stationary.

    • There is also the adf.test() from the tseries package which uses the regression equation with a constant and a linear trend for the Augmented Dickey-Fuller (ADF) test for the null hypothesis that the data has a unit root. Again, we cannot specify different model types like in ur.df().

    If we determine that the data is not \(I(0)\), then we need to check if it is \(I(1)\). To do this, carry out the unit root tests on the differences of the data. If the differences do not have a unit root (i.e. \(\Delta Y_t\) is \(I(0)\)), then the series \(Y_t\) is \(I(1)\). If the differences do have a unit root, then we need to difference the data once more and check whether the series is \(I(2)\). If the series is \(I(2)\), then the interpretation of the data will change - e.g. \(\Delta \Delta Y_t\) could be interpreted as the difference of the change in \(Y_t\). If \(\Delta \Delta Y_t\) is still not \(I(0)\), then you should re-examine your data for any outliers or errors, as higher orders of differencing are harder to interpret.
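
    A minimal sketch of this testing sequence (here y is a placeholder name for one of the series in \(Y_t\)):

    library(urca)
    library(tseries)

    # levels: ADF (H0: unit root), PP (H0: unit root) and KPSS (H0: stationarity)
    summary(ur.df(y, type = "drift", lags = 10, selectlags = "BIC"))
    pp.test(y)
    kpss.test(y)

    # if the levels appear to have a unit root, repeat the tests on the differences:
    summary(ur.df(diff(y), type = "drift", lags = 10, selectlags = "BIC"))
    # if diff(y) is I(0), then y is I(1); otherwise test diff(diff(y)) to check for I(2)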

  4. VAR(p) model order selection
    Model order selection is usually based on information criteria, most often \(AIC\) or \(BIC\) (sometimes other criteria are used). This can be done using VARselect() from the vars package. Note that \(BIC\) penalizes model complexity more heavily than \(AIC\).
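
    A minimal sketch (here Y_data is a placeholder name for the final data matrix):

    library(vars)
    # lag order preferred by each criterion: AIC, HQ, SC (i.e. BIC) and FPE
    VARselect(Y_data, lag.max = 10, type = "const")$selection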

  5. Cointegrating relationship test for unit root series
    If the time series have a unit root, we should check whether there are any cointegrating relationships between the series. There are three types of tests which can be performed (a combined R sketch follows this list):

    • The Engle-Granger procedure - it is based on the fact that if two variables \(Y_t\) and \(X_t\) are non-stationary and cointegrated, then a linear combination of them must be stationary. The test is done in two steps:

      1. Step 1: estimate the long-run (equilibrium) equation: \[Y_t = \alpha + \gamma \cdot t + \beta X_t + u_t\] In order to test for cointegration, we need to test if \(\widehat{u}_t\) is stationary. This is determined via ADF (ur.df()) tests on the model residuals (the residuals have a zero-mean, so we exclude the constant term): \[\Delta \widehat{u}_t = \rho \widehat{u}_{t-1} + \sum_{i = 1}^p c_i \Delta \widehat{u}_{t-i} + w_t\] using critical values adjusted for the number of variables. The null hypothesis of ADF test is \(H_0: \rho = 0\), i.e. the residuals have a unit root. If we reject the null hypothesis, then the residuals are stationary and we conclude that \(Y_t\) and \(X_t\) are cointegrated. Note that in this step the only important question is the stationarity or otherwise of the residuals, i.e. traditional diagnostic tests are unimportant.

      2. Step 2: in the second step, the lagged residuals are included in the regression to create an \(ECM\): \[\Delta Y_t = \phi + \delta t + \lambda \hat{u}_{t-1} + \gamma_1 \Delta Y_{t-1} + ... + \gamma_p \Delta Y_{t-p} + \omega_0 \Delta X_t + ... + \omega_q \Delta X_{t-q} + \epsilon_t\] The coefficient \(\lambda\) must be negative. (The same applies to the equation for \(\Delta X_t\).)

    • The Phillips-Ouliaris cointegration test - tests the null hypothesis \(H_0: Y_t \text{ and } X_t \text{ are not cointegrated}\). This test can be performed via ca.po() from the urca package, or po.test() from the tseries package. Phillips and Ouliaris showed that the ADF and PP unit root tests applied to the estimated cointegrating residual do not have the usual DF distributions under the null \(H_0: \text{ no cointegration}\). Note that the test type type = "Pz", compared to the test type = "Pu", has the advantage that it is invariant to the normalization of the cointegration vector, i.e. it does not matter which variable is on the left hand side of the equation.

    • The Johansen test - a test for cointegration that allows for more than one cointegrating relationship, unlike the Engle-Granger method. If the sample size is too small, the results will not be reliable and one should use an Autoregressive Distributed Lag (ADL) model instead. The ca.jo() function from the package urca can be used to carry out this test, which is based on the following \(VAR\) and \(VECM\) specifications:
      Given a general VAR model: \[\mathbf{Y}_t = \mathbf{\mu} + \Phi \mathbf{D}_t + \Theta_1 \mathbf{Y}_{t-1} + ... + \Theta_p \mathbf{Y}_{t-p} + \mathbf{\epsilon}_t,\quad t = 1,...,T\] where \(\mu\) is the constant and \(\mathbf{D}_t\) is the trend matrix. There are two possible specifications of a VECM.

      1. The first form: \[\Delta \mathbf{Y}_t = \mathbf{\mu} + \Phi \mathbf{D}_t + \Gamma_1 \Delta \mathbf{Y}_{t-1} + ... + \Gamma_{p-1} \Delta \mathbf{Y}_{t-p+1} + \Pi \mathbf{Y}_{t-p} + \mathbf{\epsilon}_t\] where: \[ \begin{aligned} \Gamma_i &= \sum_{j = 1}^i \Theta_j - \mathbf{I},\quad i = 1,...,p-1 \\ \Pi &= \sum_{j = 1}^p \Theta_j - \mathbf{I} \end{aligned} \] The \(\Gamma_i\) matrices contain the cumulative long-run impacts. If the above \(VECM\) specification is required, then spec = "longrun" must be specified when calling the ca.jo() function.
      2. The second form: \[\Delta \mathbf{Y}_t = \mathbf{\mu} + \Phi \mathbf{D}_t + \Gamma_1 \Delta \mathbf{Y}_{t-1} + ... + \Gamma_{p-1} \Delta \mathbf{Y}_{t-p+1} + \Pi \mathbf{Y}_{t-1} + \mathbf{\epsilon}_t\] where: \[ \begin{aligned} \Gamma_i &= -\sum_{j = i+1}^p \Theta_j,\quad i = 1,...,p-1 \\ \Pi &= \sum_{j = 1}^p \Theta_j - \mathbf{I} \end{aligned} \] The \(\Pi\) matrix is the same as in the first specification. However, the \(\Gamma_i\) matrices now measure the transitory effects. By setting spec = "transitory" when calling ca.jo(), the VECM of the second specification form will be estimated.

      Note that the results for \(\Pi\) will be the same, regardless of the chosen specification. The explanatory power will be the same as well.

      The Johansen test has two variations, both of which are likelihood-ratio tests. For both test statistics, the null hypothesis is \[H_0:\text{ no cointegration}\] The tests differ in terms of the alternative hypothesis:

      1. Maximum Eigenvalue Test
        The maximum eigenvalue test examines whether the \((r+1)\)-th largest eigenvalue of \(\Pi\) is zero, i.e. it tests \(H_0: rank(\Pi) = r\) against \(H_1: rank(\Pi) = r + 1\). The outline of the test is as follows:
        • The first test is done for \[H_0: rank(\Pi) = 0\] with the alternative \[H_1: rank(\Pi) = 1\]. If the rank of the matrix is zero, the largest eigenvalue is zero and there is no cointegration - the test is done;
        • Further tests (if we rejected \(H_0: rank(\Pi) = 0\)) check \[H_0: rank(\Pi) = r\] against \[H_1: rank(\Pi) = r+1\] for \(r = 1, 2, ...\). If at any point we do not reject \(H_0\), the test is completed and the rank of \(\Pi\) (i.e. the number of cointegration relations) is equal to that \(r\).
      2. Trace Test
        The trace test is a test of whether the rank of the matrix \(\Pi\) is \(r\) - the null is \[H_0: rank(\Pi) = r\] and the alternative is \[H_1: r < rank(\Pi) \leq K\] where \(K\) is the maximum possible number of cointegrating vectors (i.e. the number of rows in \(\Pi\)). If we reject \(H_0\), the next test uses \[H_0: rank(\Pi) = r + 1\] with the alternative \[H_1: r + 1 < rank(\Pi) \leq K\] and so on, starting from \(r = 0\) and continuing until we fail to reject \(H_0\). Note that, contrary to the name, this test is not based on the trace of \(\Pi\).
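
    A minimal combined sketch of the three tests (here Y and X are placeholder names for two \(I(1)\) series; the deterministic terms and lag orders should be adapted to the data at hand):

    library(urca)

    # (1) Engle-Granger: long-run regression, then an ADF-type test on its residuals
    eg_lr = lm(Y ~ X)
    u_hat = residuals(eg_lr)
    summary(ur.df(u_hat, type = "none", lags = 5, selectlags = "BIC"))
    # compare against the Engle-Granger critical values, not the standard DF ones;
    # if u_hat is stationary, estimate the ECM on the differences:
    ecm_data = data.frame(dY = diff(Y), dX = diff(X), u_lag = head(u_hat, -1))
    summary(lm(dY ~ u_lag + dX, data = ecm_data)) # the u_lag coefficient should be negative

    # (2) Phillips-Ouliaris: H0 of no cointegration; type = "Pz" is invariant to normalization
    summary(ca.po(cbind(Y, X), demean = "constant", type = "Pz"))

    # (3) Johansen: trace (or "eigen") test, transitory VECM specification
    summary(ca.jo(cbind(Y, X), type = "trace", ecdet = "const", K = 2, spec = "transitory"))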
  6. Estimating the model
    If the series are not cointegrated, we can estimate the model via the VAR() function from the package vars for the differences of the series, \(\Delta Y_t\) (if a unit root is present). If the series are cointegrated, we need to account for the long-run relationship by estimating a \(VECM\) using either VECM() from the package tsDyn, or cajorls() from the package urca, and specifying the number of cointegrating relations found in the previous step. Depending on the function, we may also need to specify the lag order of the \(VECM\) representation.
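
    A minimal sketch of both cases (here Y_data is a placeholder name for the data matrix; the lag order and the cointegration rank r = 1 are taken as given from the previous steps):

    library(vars)
    library(tsDyn)
    library(urca)

    # no cointegration: a VAR on the differenced data
    var_diff = VAR(diff(Y_data), p = 2, type = "const")

    # cointegration with rank r = 1: a VECM via tsDyn ...
    vecm_ml = VECM(Y_data, lag = 1, r = 1, include = "const", estim = "ML")
    # ... or via the restricted OLS estimator applied to a ca.jo object
    vecm_jo = ca.jo(Y_data, type = "trace", ecdet = "const", K = 2, spec = "transitory")
    vecm_ols = cajorls(vecm_jo, r = 1)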

  7. Model diagnostics tests
    Now that we have the model estimated, we need to verify that it is well specified. This is usually done by examining the residuals of the model. The most important tests for time series data are tests for autocorrelation (or serial correlation) of the residuals, also known as portmanteau tests. The two most well-known versions of this test are the Ljung-Box and the Box-Pierce tests, which are implemented in the Box.test() function from the stats package. For multivariate time series, alongside autocorrelation, another problem is the cross-correlation of the residuals, i.e., when \(cor(\epsilon_{1,t}, \epsilon_{2, t + s}) \neq 0,\quad s > 0\). For this reason, we may use the serial.test() function from the vars package, which computes a multivariate test for serially correlated errors. A multivariate Ljung-Box test is implemented in the mq() function from the MTS package.
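
    A minimal sketch, assuming var_fit is a placeholder name for a model estimated with VAR():

    library(vars)

    # multivariate Portmanteau test for serially (and cross-) correlated residuals
    serial.test(var_fit, lags.pt = 12, type = "PT.asymptotic")

    # univariate Ljung-Box test on a single residual series
    Box.test(residuals(var_fit)[, 1], lag = 12, type = "Ljung-Box")

    # multivariate Ljung-Box test
    MTS::mq(residuals(var_fit), lag = 12)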

  8. Results & conclusions
    After we verify that the model is adequate, we can either predict() future values, or examine the impulse-response functions via irf() from the vars package in order to check how the variables respond to a particular shock.

    It is also strongly recommended to partition the dataset into training and test subsets and estimate the model on the training data. We can then compare the forecasted values with the test data to measure the forecasting strength of our model by calculating a forecast error measure, like the \(MAPE\) or \(RMSE\). The size of the test dataset is usually up to 20% of the total sample; for time series data, the test data can be the last 20% of the observations. We can also repeat this process by estimating the model on the first \(T - j\) observations and evaluating the forecasts \(h\) periods ahead, for different values of \(j = T - k, T - k - 1, ..., h\), where \(T - k\) is around 20% of the sample. We can then evaluate how our forecast error changes as we increase the sample size.
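
    A minimal sketch of a single training/test split (here Y_data is a placeholder name for the full data matrix and a \(VAR(2)\) with a constant is assumed; the horizon h must not exceed the size of the test set):

    library(vars)

    h = 12
    n_test = floor(0.2 * nrow(Y_data)) # hold out ~20% of the sample
    Y_train = Y_data[1:(nrow(Y_data) - n_test), ]
    Y_test = Y_data[(nrow(Y_data) - n_test + 1):nrow(Y_data), ]

    fit = VAR(Y_train, p = 2, type = "const")
    fcst = predict(fit, n.ahead = h)

    # RMSE of the h-step-ahead forecasts of the first variable
    sqrt(mean((Y_test[1:h, 1] - fcst$fcst[[1]][, "fcst"])^2))

    # impulse-response functions of the fitted model
    plot(irf(fit, n.ahead = 12, boot = TRUE))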

Based on this answer at stackoverflow.com




VAR(2) - stationary example

We will generate the following VAR(2) model:

\[ \begin{pmatrix} Y_{t} \\ X_{t} \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \end{pmatrix} + \begin{pmatrix} 0.5 & 0.4 \\ -0.2 & 0.6 \end{pmatrix} \begin{pmatrix} Y_{t-1} \\ X_{t-1} \end{pmatrix} + \begin{pmatrix} -0.2 & -0.1 \\ 0.3 & 0.1 \end{pmatrix} \begin{pmatrix} Y_{t-2} \\ X_{t-2} \end{pmatrix} + \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix} ,\quad \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix} \sim \mathcal{N} \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 8 & 6 \\ 6 & 9 \end{bmatrix} \right) \]

set.seed(123)

nsample = 500

alpha = c(3, 2)
Theta_1 = matrix(c(0.5, 0.4, -0.2, 0.6), ncol = 2, byrow = TRUE)
Theta_2 = matrix(c(-0.2, -0.1, 0.3, 0.1), ncol = 2, byrow = TRUE)

Sigma = matrix(c(8, 6, 6, 9), ncol = 2, byrow = TRUE)

e = MASS::mvrnorm(n = nsample, mu = c(0, 0), Sigma = Sigma)

Y <- matrix(data = 0, nrow = nsample, ncol = 2)
colnames(Y) <- c("Y", "X")

Y[1, ] = alpha + e[1, ]
Y[2, ] = alpha + Theta_1 %*% Y[1, ] + e[2, ]
for(j in 3:nsample){
  Y[j, ] = alpha + Theta_1 %*% Y[j-1, ] + Theta_2 %*% Y[j-2, ] + e[j, ]
}


plot.ts(Y, plot.type = "multiple")

Testing for unit root

We could test for a unit root in a number of ways: sequentially by creating dynlm equations, or by using ur.df() or adf.test():

suppressPackageStartupMessages({
  suppressWarnings({
    library(urca)
  })
})

adf_test_Y = ur.df(Y[, 1], type = "drift", lags = 5, selectlags = "BIC")
summary(adf_test_Y)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression drift 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -7.928 -2.091 -0.126  1.894  9.443 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.14957    0.43437   9.553  < 2e-16 ***
## z.lag.1     -0.48234    0.04807 -10.035  < 2e-16 ***
## z.diff.lag1  0.23322    0.04467   5.221 2.64e-07 ***
## z.diff.lag2 -0.16735    0.04450  -3.761  0.00019 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.939 on 490 degrees of freedom
## Multiple R-squared:  0.3082, Adjusted R-squared:  0.304 
## F-statistic: 72.78 on 3 and 490 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -10.0349 50.3495 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1  6.47  4.61  3.79
adf_test_X = ur.df(Y[, 2], type = "drift", lags = 5, selectlags = "BIC")
summary(adf_test_X)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression drift 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.3332 -1.8041 -0.1827  1.9078  9.5116 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.96144    0.45762   6.471 2.36e-10 ***
## z.lag.1     -0.30167    0.04459  -6.765 3.81e-11 ***
## z.diff.lag1 -0.30417    0.05038  -6.038 3.09e-09 ***
## z.diff.lag2 -0.16015    0.04448  -3.600  0.00035 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.951 on 490 degrees of freedom
## Multiple R-squared:  0.2824, Adjusted R-squared:  0.278 
## F-statistic: 64.28 on 3 and 490 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -6.765 22.8827 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1  6.47  4.61  3.79

Because the t value of z.lag.1 is -10.035 < -2.87 - we reject the null hypothesis of a unit root in \(Y_t\). Since -6.765 < -2.87 - we reject the null hypothesis of a unit root in \(X_t\).

So, both \(Y_t\) and \(X_t\) are \(I(0)\).

Selecting the VAR model order:

We can automate the order selection via VARselect using information criteria. Note that we can specify type = c("const", "trend", "both", "none") if we want to include a constant or a trend (or both). Because we do not see any patterns of a trend, but we do see fluctuation around a non-zero mean, we choose to include a constant:

suppressPackageStartupMessages({
  suppressWarnings({
    library(vars)
  })
})

VARselect(Y, lag.max = 5, type = "const")
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      2      2      2      2 
## 
## $criteria
##                1         2         3         4         5
## AIC(n)  4.112833  3.572596  3.585883  3.596214  3.606383
## HQ(n)   4.132840  3.605941  3.632565  3.656235  3.679741
## SC(n)   4.163797  3.657537  3.704799  3.749107  3.793252
## FPE(n) 61.119624 35.608981 36.085331 36.460226 36.833118

We see that a \(VAR(2)\) model is recommended.

Johansen's procedure for stationary processes

Regarding Johansen's procedure: the matrix \(\Pi\) has full rank when the process is stationary. The Johansen procedure requires that the time series be checked for unit roots first, since it was not designed to be used as a unit root test. We may still try to carry out the test.

Note that we can also use an automated procedure which chooses the “best” VEC model (that is, its rank and lag) according to the information criteria (this approach is not based on the Johansen test):

suppressPackageStartupMessages({
  suppressWarnings({
    library(tsDyn)
  })
})
#Suppress function warnings:
suppressWarnings({
  rank.select(Y, lag.max = 10, include = "const")
})
## Best AIC: rank= 2  lag= 1 
## Best BIC: rank= 2  lag= 1 
## Best HQ : rank= 2  lag= 1

This suggests that the \(\Pi_{2 \times 2}\) matrix would be of full rank, indicating that both of our time series are stationary. Turning to the Johansen procedure: since VARselect() suggested lag = 2 for the VAR and rank.select() suggested lag = 1 for the VECM (recall that a \(VAR(p)\) corresponds to a \(VECM\) with \(p - 1\) lagged differences), we choose lag = 1 for the VECM order:

suppressWarnings({
  temp_mdl <- VECM(Y, include = "const", lag = 1, estim = "ML")
  summary(rank.test(temp_mdl))
})
##   r     trace trace_pval trace_pval_T     eigen eigen_pval
## 1 0 342.81505    < 0.001      < 0.001 277.60295    < 0.001
## 2 1  65.21211    < 0.001      < 0.001  65.21211    < 0.001

Note that once we specify the \(VECM\) model, we cannot directly test for a full rank, as the VECM itself assumes that the data has a unit root (and thus that the \(\Pi\) matrix does not have full rank). Thankfully, the output provides two tests:

  • trace - tests the null hypothesis of \(rank = r\) against the alternative: \(r < rank \leq K\) (where K is the number of rows in the \(\Pi\) matrix, i.e. the full rank), i.e. against the alternative that the system is stationary;
  • eigenvalue - tests the null hypothesis of \(rank = r\) against the alternative: \(rank = r + 1\).

We see that for \(r = 1\) the trace_pval < 0.05, so we reject the null hypothesis that \(r = 1\) (and likewise that \(r = 0\)), which means that the system is stationary (because the only remaining option is \(r = 2\), the full rank). Furthermore, eigen_pval < 0.05, so we reject the null hypothesis that \(r = 1\) against the alternative that \(r = 1 + 1 = 2\).

Similarly, with ca.jo (note that K from ca.jo corresponds to lag + 1 in rank.test). Because we specified a \(VECM(1)\) for our rank.test, we set K = 2:

temp_tst = ca.jo(Y, K = 2, type = "trace")
paste0(temp_tst@type, ", ", temp_tst@model)
## [1] "trace statistic, with linear trend"
print(cbind(capture.output(summary(temp_tst))[11:16]))
##      [,1]                                                  
## [1,] "Values of teststatistic and critical values of test:"
## [2,] ""                                                    
## [3,] "           test 10pct  5pct  1pct"                   
## [4,] "r <= 1 |  65.21  6.50  8.18 11.65"                   
## [5,] "r = 0  | 342.82 15.66 17.95 23.52"                   
## [6,] ""
temp_tst = ca.jo(Y, K = 2, type = "eigen")
paste0(temp_tst@type, ", ", temp_tst@model)
## [1] "maximal eigenvalue statistic (lambda max), with linear trend"
print(cbind(capture.output(summary(temp_tst))[11:16]))
##      [,1]                                                  
## [1,] "Values of teststatistic and critical values of test:"
## [2,] ""                                                    
## [3,] "           test 10pct  5pct  1pct"                   
## [4,] "r <= 1 |  65.21  6.50  8.18 11.65"                   
## [5,] "r = 0  | 277.60 12.91 14.90 19.19"                   
## [6,] ""

We see that the results are the same - we reject \(H_0: r = 0\) and \(H_0: r \leq 1\) in both trace and eigenvalue tests, so our cointegration matrix has the full rank, i.e. the system is stationary.

VAR estimation and output

Now that we have established that both of our time series are \(I(0)\), we can estimate a VAR model. Based on VARselect, we have chosen a \(VAR(2)\) model with a constant:

VAR_mdl = VAR(Y, p = 2, type = "const")
summary(VAR_mdl)
## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: Y, X 
## Deterministic variables: const 
## Sample size: 498 
## Log Likelihood: -2292.506 
## Roots of the characteristic polynomial:
## 0.7214 0.4899 0.4899 0.07252
## Call:
## VAR(y = Y, p = 2, type = "const")
## 
## 
## Estimation results for equation Y: 
## ================================== 
## Y = Y.l1 + X.l1 + Y.l2 + X.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)    
## Y.l1   0.51225    0.05000  10.246  < 2e-16 ***
## X.l1   0.34573    0.04846   7.135 3.48e-12 ***
## Y.l2  -0.22310    0.04790  -4.658 4.12e-06 ***
## X.l2  -0.10460    0.05046  -2.073   0.0387 *  
## const  3.76166    0.42008   8.955  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 2.832 on 493 degrees of freedom
## Multiple R-Squared: 0.418,   Adjusted R-squared: 0.4133 
## F-statistic: 88.51 on 4 and 493 DF,  p-value: < 2.2e-16 
## 
## 
## Estimation results for equation X: 
## ================================== 
## X = Y.l1 + X.l1 + Y.l2 + X.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)    
## Y.l1  -0.22639    0.05072  -4.463 1.00e-05 ***
## X.l1   0.54255    0.04916  11.037  < 2e-16 ***
## Y.l2   0.29039    0.04859   5.976 4.39e-09 ***
## X.l2   0.07986    0.05119   1.560    0.119    
## const  3.17114    0.42616   7.441 4.47e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 2.873 on 493 degrees of freedom
## Multiple R-Squared: 0.3778,  Adjusted R-squared: 0.3727 
## F-statistic: 74.83 on 4 and 493 DF,  p-value: < 2.2e-16 
## 
## 
## 
## Covariance matrix of residuals:
##       Y     X
## Y 8.023 5.602
## X 5.602 8.257
## 
## Correlation matrix of residuals:
##        Y      X
## Y 1.0000 0.6883
## X 0.6883 1.0000

Now we can write the model:

\[ \begin{cases} Y_t &= 3.76166 + 0.51225 Y_{t-1} + 0.34573 X_{t-1} - 0.22310 Y_{t-2} - 0.10460 X_{t-2}\\ X_t &= 3.17114 - 0.22639 Y_{t-1} + 0.54255 X_{t-1} + 0.29039 Y_{t-2} + 0.07986 X_{t-2} \end{cases} \]

We could also re-write the system in a matrix form for easier comparison with the actual model (try it!). We see that the second equation is not as accurately estimated as the first one (note the insignificant \(X_{t-2}\) coefficient).
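
For instance, the estimated coefficient matrices can be extracted with Acoef() and Bcoef() from the vars package and compared with the true \(\Theta_1\), \(\Theta_2\) and \(\alpha\) used in the simulation:

Acoef(VAR_mdl) # list with the estimated Theta_1 and Theta_2 matrices
Bcoef(VAR_mdl) # all coefficients, including the constant, in a single matrix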

We can estimate a restricted VAR, either by manually imposing zero restrictions, or automatically, based on coefficient significance:

VAR_mdl_restricted = restrict(VAR_mdl, method = "ser", thresh = 2.0, resmat = NULL)
summary(VAR_mdl_restricted)
## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: Y, X 
## Deterministic variables: const 
## Sample size: 498 
## Log Likelihood: -2294.831 
## Roots of the characteristic polynomial:
## 0.6746 0.5249 0.5249 0.1855
## Call:
## VAR(y = Y, p = 2, type = "const")
## 
## 
## Estimation results for equation Y: 
## ================================== 
## Y = Y.l1 + X.l1 + Y.l2 + X.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)    
## Y.l1   0.51225    0.05000  10.246  < 2e-16 ***
## X.l1   0.34573    0.04846   7.135 3.48e-12 ***
## Y.l2  -0.22310    0.04790  -4.658 4.12e-06 ***
## X.l2  -0.10460    0.05046  -2.073   0.0387 *  
## const  3.76166    0.42008   8.955  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 2.832 on 493 degrees of freedom
## Multiple R-Squared: 0.418,   Adjusted R-squared: 0.4133 
## F-statistic: 88.51 on 4 and 493 DF,  p-value: < 2.2e-16 
## 
## 
## Estimation results for equation X: 
## ================================== 
## X = Y.l1 + X.l1 + Y.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)    
## Y.l1  -0.22379    0.05077  -4.408 1.28e-05 ***
## X.l1   0.57220    0.04540  12.603  < 2e-16 ***
## Y.l2   0.32956    0.04166   7.910 1.70e-14 ***
## const  3.30262    0.41835   7.894 1.90e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 2.878 on 494 degrees of freedom
## Multiple R-Squared: 0.9248,  Adjusted R-squared: 0.9242 
## F-statistic:  1520 on 4 and 494 DF,  p-value: < 2.2e-16 
## 
## 
## 
## Covariance matrix of residuals:
##       Y     X
## Y 8.023 5.602
## X 5.602 8.297
## 
## Correlation matrix of residuals:
##        Y      X
## Y 1.0000 0.6866
## X 0.6866 1.0000

We now see our restricted VAR model:

\[ \begin{cases} Y_t &= 3.76166 + 0.51225 Y_{t-1} + 0.34573 X_{t-1} - 0.22310 Y_{t-2} - 0.10460 X_{t-2}\\ X_t &= 3.30262 - 0.22379 Y_{t-1} + 0.57220 X_{t-1} + 0.32956 Y_{t-2} \end{cases} \]

We also note the estimated Covariance matrix of residuals:

\[ \widehat{\Sigma}_\epsilon = \begin{pmatrix} 8.023 & 5.602 \\ 5.602 & 8.297 \end{pmatrix} \]

which is quite close to the true \(\Sigma_\epsilon\).

VAR fit plots

We can also look at the fitted values:

plot(VAR_mdl_restricted, names = "Y")

plot(VAR_mdl_restricted, names = "X")