A number of equations are provided for data simulation.
In addition, there are various time-series data available, such as:
Until 1982, when Nelson and Plosser published their analysis of U.S. macroeconomic time series, most economists believed that all time series were TS, or trend-stationary (i.e. they became stationary after removing the trend). Nelson and Plosser presented evidence that most economic series were DS, or difference-stationary (i.e. their differences were stationary). Verify whether this holds on some sample datasets with real-world data.
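The distinction between the two classes can be illustrated with a small simulation. This is a minimal sketch in base R; the sample size, coefficients and seed are arbitrary assumptions, not values taken from the text.

```r
set.seed(123)            # arbitrary seed, for reproducibility only
n   <- 200               # assumed sample size
eps <- rnorm(n)          # white-noise shocks
tt  <- seq_len(n)        # time index

# TS process: deterministic linear trend plus stationary noise
y_ts <- 1 + 0.5 * tt + eps

# DS process: random walk with drift, Y_t = 0.5 + Y_{t-1} + eps_t
y_ds <- cumsum(0.5 + eps)

# Removing the fitted trend from the TS series leaves stationary residuals
ts_resid <- resid(lm(y_ts ~ tt))

# Differencing the DS series leaves drift plus white noise
ds_diff <- diff(y_ds)
```

Plotting y_ts and y_ds side by side should show two series that look similar to the eye, while ts_resid and ds_diff both fluctuate around a constant level. This is why formal unit root tests are needed to tell the two classes apart.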
Fourteen U.S. economic time series from 1860 to 1970. See the documentation for descriptions of the variables, such as gnp.r and cpi:
data(nporg, package = "urca")
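To work with a single series from this data frame, it can be converted to a ts object. This is a minimal sketch, assuming the urca package is installed and that the missing values in gnp.r occur only at the start of the sample (na.omit fails on a time series with internal NAs). Taking logs is also an assumption here, not a requirement of the task.

```r
data(nporg, package = "urca")

# Annual series, indexed by the year column of the data frame
gnp_r <- stats::ts(nporg$gnp.r, start = min(nporg$year), frequency = 1)

# Drop the leading NA values (years before the series is recorded)
gnp_r <- stats::na.omit(gnp_r)

# Assumption: work with logs, a common choice for macroeconomic level series
lgnp_r <- log(gnp_r)
```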
UK data frame of quarterly data ranging from 1955:Q1 until 1984:Q4. The data are expressed in natural logarithms:
consl - The log of total real consumption in the U.K.
incl - The log of real disposable income in the U.K.
data(UKconinc, package = "urca")
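The quarterly dates can be attached explicitly when building ts objects from this data frame. This is a sketch, assuming urca is installed and that the rows are ordered from 1955:Q1 onwards, as the description states.

```r
data(UKconinc, package = "urca")

# Quarterly series starting in the first quarter of 1955
consl <- stats::ts(UKconinc$consl, start = c(1955, 1), frequency = 4)
incl  <- stats::ts(UKconinc$incl,  start = c(1955, 1), frequency = 4)

# Inspect the covered time span (start, end, frequency)
print(stats::tsp(consl))
```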
UK data frame of quarterly data ranging from 1957:Q1 until 1975:Q4:
cons - Consumers' non-durable expenditure in the U.K. in 1970 prices.
inc - Personal disposable income in the U.K. in 1970 prices.
price - Consumers' expenditure deflator index, 1970 = 100.
data(UKconsumption, package = "urca")
internet <- stats::ts(c(88, 84, 85, 85, 84, 85, 83, 85, 88, 89, 91, 99,
104, 112, 126, 138, 146, 151, 150, 148, 147, 149, 143, 132, 131,
139, 147, 150, 148, 145, 140, 134, 131, 131, 129, 126, 126, 132,
137, 140, 142, 150, 159, 167, 170, 171, 172, 172, 174, 175, 172,
172, 174, 174, 169, 165, 156, 142, 131, 121, 112, 104, 102, 99,
99, 95, 88, 84, 84, 87, 89, 88, 85, 86, 89, 91, 91, 94, 101, 110,
121, 135, 145, 149, 156, 165, 171, 175, 177, 182, 193, 204, 208,
210, 215, 222, 228, 226, 222, 220), start = 1, frequency = 1)
chicken <- stats::ts(c(164.16, 169.17, 180.65, 168.30, 180.73, 192.55,
159.43, 150.11, 126.05, 106.08, 119.92, 157.06, 156.59, 161.21,
151.94, 137.47, 134.10, 153.25, 166.02, 203.24, 194.83, 208.18,
204.40, 171.61, 180.87, 154.12, 133.40, 139.22, 120.43, 119.53,
90.41, 100.48, 85.16, 70.41, 70.04, 54.59, 59.59, 48.84, 48.78,
47.25, 42.90, 40.80, 43.23, 34.23, 34.09, 38.27, 33.90, 27.48,
31.12, 49.16, 28.44, 26.60, 33.02, 29.34, 27.49, 27.67, 19.29,
17.65, 15.43, 18.43, 22.12, 19.88, 16.48, 14.00, 11.25, 17.38,
16.45, 15.69, 15.25, 14.64), start = 1924, frequency = 1)
The Temp (temperature) variable of the airquality dataset:
airquality <- datasets::airquality
Note: It may very well be the case that some (or even all) of the data do not have unit roots. The idea is to carry out the unit root testing and model building procedures, as you would when working with any other empirical data.
Carry out a unit root test in the following ways:
- Use dynlm to estimate the relevant models for unit root testing. Don’t forget to write down the null hypothesis for the unit root test.
- Use the ADF, KPSS and PP tests. Write down the null hypothesis of each test.
Depending on the results, transform the series to induce stationarity and examine its ACF and PACF plots. Select the appropriate model either manually or using auto.arima (remember the drawback of automated procedures: if needed, restrict the maximum number of differences and the seasonal/nonseasonal lag orders).
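The unit root testing step can be sketched as follows. This is an illustration only: it assumes the urca and dynlm packages are installed, uses the airquality Temp series as an example, and the lag lengths and deterministic terms are arbitrary choices that should be adapted to each series.

```r
library(urca)
library(dynlm)

# Illustration series: daily temperature from the airquality data
y <- stats::ts(datasets::airquality$Temp)

# Manual test regression: dY_t on a constant, a trend, Y_{t-1} and one lagged difference.
# H0: the coefficient on L(y, 1) is zero, i.e. the series has a unit root.
mdl <- dynlm(d(y) ~ L(y, 1) + L(d(y), 1) + trend(y))
print(summary(mdl))

# ADF and PP: H0 is a unit root. KPSS: H0 is (trend-)stationarity.
adf_test  <- ur.df(y, type = "trend", lags = 4, selectlags = "BIC")
pp_test   <- ur.pp(y, type = "Z-tau", model = "trend", lags = "short")
kpss_test <- ur.kpss(y, type = "tau", lags = "short")

# Compare each test statistic with its tabulated critical values
print(adf_test@teststat);  print(adf_test@cval)
print(pp_test@teststat);   print(pp_test@cval)
print(kpss_test@teststat); print(kpss_test@cval)
```

Note that the t-statistic on L(y, 1) reported by summary(mdl) has to be compared with Dickey-Fuller critical values (such as those printed for adf_test), not with the usual t distribution.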
Write down the model equation for ΔY_t and the equation for Y_t. (Note: you are free to use either dynlm or Arima to specify your model equation, as long as it is the one you used in the previous tasks. You can also consult the auto.arima documentation on its authors' website for a more general model formula that uses lag functions of the autocorrelation parameters.)
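As an illustration of what is being asked, suppose (hypothetically) that the selected model is an ARIMA(1,1,1) with a constant. The symbols α, φ and θ are notation assumed here, not taken from the text. The equation for the differences, and the equation for the level obtained by substituting ΔY_t = Y_t − Y_{t−1}, would then be:

```latex
\begin{aligned}
\Delta Y_t &= \alpha + \phi\,\Delta Y_{t-1} + \epsilon_t + \theta\,\epsilon_{t-1}, \\
Y_t &= \alpha + (1 + \phi)\,Y_{t-1} - \phi\,Y_{t-2} + \epsilon_t + \theta\,\epsilon_{t-1}.
\end{aligned}
```

When reading coefficients off an Arima fit with drift, keep in mind that the reported drift is, to the best of our understanding, the mean of ΔY_t rather than α itself; in this example the two would be related by α = (1 − φ) × drift.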
Calculate the 10-step ahead forecasts for the original series.
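A possible sketch of the model selection and forecasting steps, assuming the forecast package is installed. The airquality Temp series and the maximum orders below are illustrative assumptions only.

```r
library(forecast)

# Illustration series: daily temperature from the airquality data
y <- stats::ts(datasets::airquality$Temp)

# Restrict the automated search: at most one difference,
# no seasonal terms, and small AR and MA lag orders
fit <- auto.arima(y, max.d = 1, max.p = 3, max.q = 3, seasonal = FALSE)
print(fit)

# 10-step-ahead forecasts; any differencing is handled inside the model,
# so the forecasts are on the scale of the original series
fc <- forecast(fit, h = 10)
print(fc$mean)
plot(fc)
```

Fitting the model to the original series, with the differencing order specified inside the ARIMA model, avoids having to undo the differencing by hand when forecasting.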
Carry out cross-validation for one-step-ahead forecasts, creating between 5 and 20 subsets and using the maximum possible number of samples k for your dataset: