Consider the following IS-LM model:
\[\begin{aligned} \begin{cases} R_t = \beta_{11} + \beta_{12} M_t + \beta_{13} Y_t + \beta_{14} M_{t-1} + u_{1t}\\ Y_t = \beta_{21} + \beta_{22} R_t + \beta_{23} I_t + u_{2t} \end{cases} \end{aligned} \]
Where:
The example data are annual time series from 1969 to 1977 for the UK economy:
suppressPackageStartupMessages({
library(readxl)
require(systemfit)
require(AER)
})
txt1 <- "http://uosis.mif.vu.lt/~rlapinskas/(data%20R&GRETL/"
txt2 <- "simult.xls"
tmp = tempfile(fileext = ".xls")
download.file(url = paste0(txt1, txt2),
destfile = tmp, mode = "wb")
sim.dt <- data.frame(read_excel(path = tmp))
data1 <- ts(sim.dt[, -1], frequency = 1, start = 1969)
data1 <- data.frame(data1,
M.L1 = c(NA, data1[-nrow(data1), 2]))
We could estimate the individual equations via OLS (for the sake of comparativeness with other estimations, we will exclude the first row with NA
value):
R.ols <- lm(R ~ M + Y + M.L1, data = data1[-1, ])
Y.ols <- lm(Y ~ R + I, data = data1[-1, ])
round(summary(R.ols)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.5581 10.1861 1.9201 0.0668
## M 0.0013 0.0018 0.6979 0.4920
## Y -0.1031 0.2048 -0.5035 0.6192
## M.L1 -0.0015 0.0017 -0.8406 0.4089
round(summary(Y.ols)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 14.9134 8.0532 1.8519 0.0759
## R -0.1691 0.3165 -0.5342 0.5979
## I 0.0007 0.0001 11.8330 0.0000
However, in this case, we would ignore the fact that there are endogenous variables in the right hand side of our equations. Because the endogenous variables correlate with the residuals, our OLS estimates are no longer BLUE. As such, we need to account for this and carry out a two-stage least square procedure.
The 2SLS procedure is carried out as follows:
R.2sls <- ivreg(R ~ M + Y + M.L1 | M + M.L1 + I, data = data1[-1, ])
Y.2sls <- ivreg(Y ~ R + I | M + M.L1 + I, data = data1[-1, ])
round(summary(R.2sls)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 27.5275 11.1348 2.4722 0.0209
## M 0.0019 0.0019 1.0091 0.3230
## Y -0.2647 0.2241 -1.1809 0.2492
## M.L1 -0.0017 0.0018 -0.9855 0.3342
round(summary(Y.2sls)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 98.7996 68.7067 1.4380 0.1628
## R -4.0430 3.1306 -1.2914 0.2084
## I 0.0002 0.0004 0.5329 0.5988
The coefficients of Y as well as M are insignificant, suggesting that the LM function is very flat. Overall, accounting for the endogeneity changed the coefficient values.
However, in this case the coefficients remained insignificant, which indicates that besides endogeneity, but other factors might also not be overlooked. One of which is the fact that the residuals (or, more intuitively, random [economic] shocks) are correlated between the equations. If that is the case, we need to estimate the whole equation system as a whole.
We will begin by estimating the whole equation system the same way as we did previously (as independent equations). The only difference is that we will estimate them both at once. (Note: systemfit
cannot deal with NA
values, so those rows need to be dropped)
eq.R <- R ~ 1 + M + Y + M.L1
eq.Y <- Y ~ 1 + R + I
eq.sys <- list(R = eq.R, Y = eq.Y)
instr <- ~ M + M.L1 + I
eq.sys.ols <- systemfit(eq.sys, method = "OLS", data = data1[-1, ])
eq.sys.2sls<- systemfit(eq.sys, method = "2SLS", inst = instr, data = data1[-1, ])
round(summary(eq.sys.ols)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## R_(Intercept) 19.5581 10.1861 1.9201 0.0668
## R_M 0.0013 0.0018 0.6979 0.4920
## R_Y -0.1031 0.2048 -0.5035 0.6192
## R_M.L1 -0.0015 0.0017 -0.8406 0.4089
## Y_(Intercept) 14.9134 8.0532 1.8519 0.0759
## Y_R -0.1691 0.3165 -0.5342 0.5979
## Y_I 0.0007 0.0001 11.8330 0.0000
round(summary(eq.sys.2sls)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## R_(Intercept) 27.5275 11.1348 2.4722 0.0209
## R_M 0.0019 0.0019 1.0091 0.3230
## R_Y -0.2647 0.2241 -1.1809 0.2492
## R_M.L1 -0.0017 0.0018 -0.9855 0.3342
## Y_(Intercept) 98.7996 68.7067 1.4380 0.1628
## Y_R -4.0430 3.1306 -1.2914 0.2084
## Y_I 0.0002 0.0004 0.5329 0.5988
The results are identical to the ones from lm
and ivreg
. The disadvantage is that if some data is missing and needs to be excluded from systemfit
, the estimated coefficients may sometimes differ from the ones in lm
and ivreg
, since the missing values are not needed to be excluded from the dataset.
eq.sys.sur <- systemfit(eq.sys, method = "SUR", data = data1[-1, ])
round(summary(eq.sys.sur)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## R_(Intercept) 10.3175 9.8370 1.0488 0.3047
## R_M 0.0004 0.0017 0.2083 0.8368
## R_Y 0.0987 0.1968 0.5014 0.6206
## R_M.L1 -0.0010 0.0016 -0.6187 0.5419
## Y_(Intercept) 23.8657 7.8012 3.0592 0.0052
## Y_R -0.5928 0.3021 -1.9623 0.0610
## Y_I 0.0007 0.0001 11.1904 0.0000
While bow we are accounting for the correlation of shocks, we are not accounting for endogenous predictors in the right hand side of the equations.
eq.sys.3sls <- systemfit(eq.sys, method = "3SLS", inst = instr, data = data1[-1, ])
round(summary(eq.sys.3sls)$coefficients, 4)
## Estimate Std. Error t value Pr(>|t|)
## R_(Intercept) 18.7421 8.9367 2.0972 0.0467
## R_M -0.0003 0.0009 -0.3496 0.7297
## R_Y -0.0697 0.1689 -0.4130 0.6833
## R_M.L1 0.0001 0.0011 0.0572 0.9548
## Y_(Intercept) 98.7996 68.7067 1.4380 0.1628
## Y_R -4.0430 3.1306 -1.2914 0.2084
## Y_I 0.0002 0.0004 0.5329 0.5988
For easier comparison, we can combine all the model results (remember that singe-equation estimation can be specified with systemfit
as well):
cof <- cbind(coef(eq.sys.ols), coef(eq.sys.2sls),
coef(eq.sys.sur), coef(eq.sys.3sls))
colnames(cof)<-c("OLS","2SLS","SUR", "3SLS")
round(cof, 4)
## OLS 2SLS SUR 3SLS
## R_(Intercept) 19.5581 27.5275 10.3175 18.7421
## R_M 0.0013 0.0019 0.0004 -0.0003
## R_Y -0.1031 -0.2647 0.0987 -0.0697
## R_M.L1 -0.0015 -0.0017 -0.0010 0.0001
## Y_(Intercept) 14.9134 98.7996 23.8657 98.7996
## Y_R -0.1691 -4.0430 -0.5928 -4.0430
## Y_I 0.0007 0.0002 0.0007 0.0002
We can see that, once we incorporate the fact that we have an equation system where the shocks could be correlated across equations, the estimated values do differ. If we also address the endogeneity problem, the coefficients change even more. Some of the most notable differences:
We can also compare the standard errors of the coefficients:
coef.se <- cbind(summary(eq.sys.ols)$coefficients[, 2], summary(eq.sys.2sls)$coefficients[, 2],
summary(eq.sys.sur)$coefficients[, 2], summary(eq.sys.3sls)$coefficients[, 2])
colnames(coef.se)<-c("OLS","2SLS","SUR", "3SLS")
round(coef.se, 4)
## OLS 2SLS SUR 3SLS
## R_(Intercept) 10.1861 11.1348 9.8370 8.9367
## R_M 0.0018 0.0019 0.0017 0.0009
## R_Y 0.2048 0.2241 0.1968 0.1689
## R_M.L1 0.0017 0.0018 0.0016 0.0011
## Y_(Intercept) 8.0532 68.7067 7.8012 68.7067
## Y_R 0.3165 3.1306 0.3021 3.1306
## Y_I 0.0001 0.0004 0.0001 0.0004
While for some predictors the standard errors decrease, there are some, like the \(R_t\) predictor in \(Y_t\) equation, which standard errors increase, although compared with the 2SLS
, it is not significant.
We note that the second equation is overidentified - we are using 3+1=4
instrumental (exogenous predictors + constant) variables for the 2nd equation, which has only 3 parameters - we could try to decrease the number of instruments, or find instruments that have a higher correlation with \(R_t\).