suppressPackageStartupMessages({
library(plm)
})
data("Gasoline", package = "plm")
head(Gasoline)
The dataset contains observations on gasoline consumption for 18 countries from 1960 to 1978 (a total of 342 observations). In the dataset:
country - a factor with 18 levels;
year - the year;
lgaspcar - logarithm of motor gasoline consumption per car;
lincomep - logarithm of real per-capita income;
lrpmg - logarithm of real price of motor gasoline;
lcarpcap - logarithm of the stock of cars per capita.
Carry out the following tasks:
Visually inspect the lgaspcar data in each country. Is the (logarithm of) gas consumption per car similar across countries, or could there be country-specific (fixed) effects that influence gas consumption? If so, what, in your opinion, could these (fixed) effects be?
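One possible way to carry out the visual inspection is sketched below with base R graphics only. The choice of a box plot and a conditioning plot is an assumption; any per-country plot would serve the same purpose.

```r
library(plm)
data("Gasoline", package = "plm")

# Distribution of log gasoline consumption per car, one box per country.
# Clearly different levels across boxes hint at country-specific effects.
boxplot(lgaspcar ~ country, data = Gasoline, las = 2,
        xlab = "", ylab = "lgaspcar")

# Time path of lgaspcar, one panel per country, on a common scale.
coplot(lgaspcar ~ year | country, data = Gasoline, type = "l")
```

If the boxes sit at visibly different levels while the time paths have similar shapes, a country-specific intercept is a natural candidate.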
Assume that you are tasked to create a model for the (log of) gasoline consumption using the data available in this dataset. Which exogenous variables would you include in your model and what would you expect their signs to be?
Create a pooled OLS model for lgaspcar using relevant exogenous variables. Then answer the following:
Part 2?
Taking into account the results from Part 3 and the overview of the data from Part 1:
Part 3?
Suppose that the variation across entities (people, cities, etc.) is random and uncorrelated with the predictor (i.e. independent) variables included in the model. Estimate a Random Effects model and compare the predictor coefficients with the ones from POLS and FE.
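The three estimators can be fitted with `plm()` by changing only the `model` argument. The sketch below assumes a specification with all three regressors (the task leaves the choice of variables to the reader), and the object names `pols`, `fe` and `re` are illustrative.

```r
library(plm)
data("Gasoline", package = "plm")

# Assumed specification: all three available regressors
fml <- lgaspcar ~ lincomep + lrpmg + lcarpcap
idx <- c("country", "year")

pols <- plm(fml, data = Gasoline, index = idx, model = "pooling")  # pooled OLS
fe   <- plm(fml, data = Gasoline, index = idx, model = "within")   # fixed effects
re   <- plm(fml, data = Gasoline, index = idx, model = "random")   # random effects

# Slope coefficients side by side (the within model has no overall intercept)
slopes <- c("lincomep", "lrpmg", "lcarpcap")
coef_table <- cbind(POLS = coef(pols)[slopes],
                    FE   = coef(fe)[slopes],
                    RE   = coef(re)[slopes])
round(coef_table, 4)
```

Large differences between the POLS column and the FE column are a first sign that ignoring country effects changes the estimates.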
We would prefer the RE estimator if we can be sure that the individual-specific effect really is unrelated to the regressors (see slide 8). Test whether the RE estimator is consistent compared to the FE estimator. Based on the test result and the result from Task 4, which model would you choose: POLS, FE, or RE?
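A common way to compare the RE and FE estimators is the Hausman test, available in plm as `phtest()`. The sketch below again assumes the three-regressor specification, which is not prescribed by the task.

```r
library(plm)
data("Gasoline", package = "plm")

# Assumed specification: all three available regressors
fml <- lgaspcar ~ lincomep + lrpmg + lcarpcap
fe <- plm(fml, data = Gasoline, index = c("country", "year"), model = "within")
re <- plm(fml, data = Gasoline, index = c("country", "year"), model = "random")

# H0: both estimators are consistent (RE is then the efficient one)
# H1: only FE is consistent
ht <- phtest(fe, re)
ht
```

A small p-value speaks against the RE assumption that the country effects are uncorrelated with the regressors, in which case FE would be the safer choice.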
Plot the fitted data alongside your actual data for:
Task 6);
Visually inspect the data: does the best model fit the countries equally well?
Calculate the mean squared error for each country separately. Which country has the largest MSE, and does it align with your conclusions from the plots?
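A minimal sketch of the fitted-versus-actual plots and the per-country MSE follows. It uses the fixed effects model purely as a stand-in for whichever model was chosen, and it recovers the fitted values as actual minus residual, which for a within model includes the estimated country effect.

```r
library(plm)
data("Gasoline", package = "plm")

# Stand-in model (an assumption): fixed effects with all three regressors
fe <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap, data = Gasoline,
          index = c("country", "year"), model = "within")

# Actual and fitted values, aligned with the model's own index
idx <- index(fe)
res <- data.frame(country = idx[[1]],
                  year    = as.integer(as.character(idx[[2]])),
                  actual  = as.numeric(fe$model[[1]]))
res$fitted <- res$actual - as.numeric(residuals(fe))

# One panel per country: actual (solid) against fitted (dashed red)
op <- par(mfrow = c(3, 6), mar = c(2, 2, 2, 1))
for (cn in levels(res$country)) {
  d <- res[res$country == cn, ]
  plot(d$year, d$actual, type = "l", main = cn, xlab = "", ylab = "",
       ylim = range(d$actual, d$fitted))
  lines(d$year, d$fitted, col = "red", lty = 2)
}
par(op)

# In-sample mean squared error per country, largest first
mse <- tapply((res$actual - res$fitted)^2, res$country, mean)
sort(mse, decreasing = TRUE)
```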
Note that in order to forecast lgaspcar, your specified models require forecasts of the exogenous variables, which we do not usually have. We can think of two quick (but not necessarily the best) ways to remedy this:
(NOTE: you can take 80% of the dataset and re-fit your previous panel data model. Then you can compare the exogenous variable forecasts as well as the panel data model forecasts with the actual values.)
For each country and each variable, use auto.arima to fit a model for each of the exogenous variables (some, or all, of lincomep, lrpmg, lcarpcap) that you included in your model. Forecast each model \(h = 5\) periods ahead.
For each country, fit a VAR (or VECM) model on the vector of included exogenous variables (some, or all, of lincomep, lrpmg, lcarpcap).
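Both approaches to forecasting the exogenous variables can be sketched as follows. The code assumes the forecast and vars packages are installed, uses all three regressors, and fixes the VAR lag order at `p = 1`; these are illustrative choices, not requirements of the task.

```r
library(plm)       # only for the Gasoline data
library(forecast)  # auto.arima(), forecast()
library(vars)      # VAR(), predict()

data("Gasoline", package = "plm")
xvars <- c("lincomep", "lrpmg", "lcarpcap")  # assumed regressor set
h <- 5                                       # forecast horizon from the task
by_country <- split(Gasoline, Gasoline$country)

# Method 1: a separate auto.arima model per country and per variable
fc_arima <- lapply(by_country, function(d) {
  d <- d[order(d$year), ]
  sapply(xvars, function(v) {
    y <- ts(d[[v]], start = min(d$year), frequency = 1)
    as.numeric(forecast(auto.arima(y), h = h)$mean)
  })
})

# Method 2: one VAR per country on the vector of regressors
# (p = 1 is an assumption; 19 annual observations leave little room for more lags)
fc_var <- lapply(by_country, function(d) {
  d <- d[order(d$year), ]
  y <- ts(as.matrix(d[, xvars]), start = min(d$year), frequency = 1)
  fit <- VAR(y, p = 1, type = "const")
  sapply(predict(fit, n.ahead = h)$fcst, function(m) m[, "fcst"])
})

# Each list element is an h x 3 matrix of point forecasts for one country
fc_arima[[1]]
fc_var[[1]]
```

Keeping both results in the same shape (one matrix per country, one column per variable) makes it easy to swap one set of forecasts for the other later on.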
Once we obtain the forecasts for the exogenous variables, we can move on to forecast our variable of interest:
Use the forecasted exogenous variables (from either one, or both, of the forecasting methods) to estimate a forecast of lgaspcar in your panel data model.
Examine the forecasts. Would you consider them adequate? (Take note of the historical increase/decrease in the data and check whether the forecasts make sense.)
Something to think about: your panel data model forecasts will depend not only on the accuracy of the panel data model, but also on the accuracy of the exogenous variable models.
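To illustrate how exogenous forecasts feed into the panel model, the sketch below combines the estimated country effects and slopes of a fixed effects model by hand. The list `x_future` is a hypothetical placeholder: here it simply repeats each country's last observed values, and in practice it would be replaced by the ARIMA or VAR forecasts. The fixed effects specification is likewise an assumption.

```r
library(plm)
data("Gasoline", package = "plm")

h <- 5
fe <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap, data = Gasoline,
          index = c("country", "year"), model = "within")
alpha <- fixef(fe)  # estimated country-specific intercepts
beta  <- coef(fe)   # common slope coefficients

# PLACEHOLDER exogenous forecasts (an assumption): last observed value repeated
# h times. Swap in one h x 3 matrix per country from ARIMA or VAR forecasts.
x_future <- lapply(split(Gasoline, Gasoline$country), function(d) {
  d <- d[order(d$year), ]
  last <- unlist(d[nrow(d), names(beta)])
  matrix(last, nrow = h, ncol = length(beta), byrow = TRUE,
         dimnames = list(NULL, names(beta)))
})

# Forecast for country i at step s: alpha_i + x_future[i][s, ] %*% beta
fc_y <- sapply(names(x_future), function(cn) {
  as.numeric(alpha[cn]) + as.numeric(x_future[[cn]][, names(beta)] %*% beta)
})

round(fc_y[, 1:3], 3)  # h rows, one column per country
```

Because the forecast is linear in the regressors, any error in the exogenous forecasts passes through to lgaspcar scaled by the corresponding slope coefficient.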