Note: when generating data, use `np.random.seed(student_code)`, where `student_code` is your unique student code.

Examine the following processes:

- \(Y_t = 1 + (1 + 0.5L)\epsilon_t\)
- \(Y_t = 2 + (1 - 1.3L) \epsilon_t\)
- \(Y_t = 1 + (1 + 1.3L - 0.4 L^2)\epsilon_t\)
- \(Y_t = (1 + 0.4 L^2)\epsilon_t\)
- \((1 - 1.1L) Y_t = 1 + \epsilon_t\)
- \((1 + 0.5L) Y_t = 2 + \epsilon_t\)
- \((1 - 1.1L + 0.2L^2) Y_t = 1 + \epsilon_t\)
- \((1 - 0.2L^2) Y_t = 2 + \epsilon_t\)

- \((1 - 1.1L + 0.2L^2) Y_t = (1 + 1.3L)\epsilon_t\)
- \((1 - 0.5L) Y_t = 2 + (1 - 0.5L)\epsilon_t\)

1.1 What kind of models are specified in the equations - \(AR(p)\), \(MA(q)\) or \(ARMA(p,q)\) (do not forget to specify `p` and `q`)? Are the processes stationary and/or invertible? Explain your answers.
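One way to check an answer to 1.1 numerically is to compute the roots of the lag polynomials: an AR polynomial with all roots outside the unit circle gives a stationary process, and an MA polynomial with all roots outside the unit circle gives an invertible one. The sketch below assumes only NumPy; the helper name `roots_outside_unit_circle` is made up for illustration.

```python
import numpy as np

def roots_outside_unit_circle(lag_poly):
    """Check whether all roots of a lag polynomial lie outside the unit circle.

    lag_poly lists the coefficients of 1, L, L^2, ... in that order,
    e.g. (1 - 1.1L + 0.2L^2) is written as [1, -1.1, 0.2].
    """
    # np.roots expects the highest power first, so reverse the list.
    roots = np.roots(lag_poly[::-1])
    return bool(np.all(np.abs(roots) > 1)), roots

# AR part of (1 - 1.1L + 0.2L^2) Y_t = (1 + 1.3L) eps_t
stationary, ar_roots = roots_outside_unit_circle([1, -1.1, 0.2])
# MA part of the same process
invertible, ma_roots = roots_outside_unit_circle([1, 1.3])

print("AR roots:", ar_roots, "stationary:", stationary)
print("MA roots:", ma_roots, "invertible:", invertible)
```

The numerical check does not replace the explanation asked for in 1.1; it only confirms the algebra.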

1.2 Simulate the data for each process with sample size \(T = 150\) and a \(WN\) component \(\epsilon_t \sim \mathcal{N}(0, 0.5^2)\). Assume that if \(t \leq 0\), then \(Y_t = \epsilon_t = 0\).
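A minimal sketch of one possible simulation loop for 1.2 follows. It assumes each process is first rewritten as \(Y_t = c + \sum_i \phi_i Y_{t-i} + \epsilon_t + \sum_j \theta_j \epsilon_{t-j}\), so that, for example, \((1 - 1.1L) Y_t = 1 + \epsilon_t\) has `const=1`, `ar=[1.1]`, `ma=[]`. The function name `simulate_arma` and the seed value `123` are placeholders; use your own student code as the seed.

```python
import numpy as np

def simulate_arma(const, ar, ma, T=150, sigma=0.5, seed=123):
    """Simulate Y_t = const + sum_i ar[i-1]*Y_{t-i} + eps_t + sum_j ma[j-1]*eps_{t-j}.

    Values with t <= 0 are treated as zero, as the task requires.
    Array index 0 corresponds to t = 1.
    """
    np.random.seed(seed)  # placeholder seed: replace with your student code
    eps = np.random.normal(loc=0.0, scale=sigma, size=T)
    y = np.zeros(T)
    for t in range(T):
        value = const + eps[t]
        for i, phi in enumerate(ar, start=1):
            if t - i >= 0:  # skip pre-sample terms, which are zero
                value += phi * y[t - i]
        for j, theta in enumerate(ma, start=1):
            if t - j >= 0:
                value += theta * eps[t - j]
        y[t] = value
    return y, eps

# Y_t = 1 + (1 + 0.5L) eps_t
y_ma1, eps_ma1 = simulate_arma(const=1.0, ar=[], ma=[0.5])
# (1 - 1.1L + 0.2L^2) Y_t = (1 + 1.3L) eps_t
y_arma, eps_arma = simulate_arma(const=0.0, ar=[1.1, -0.2], ma=[1.3])
```

Note the sign change when moving AR terms to the right-hand side: \((1 + 0.5L) Y_t = 2 + \epsilon_t\) becomes `ar=[-0.5]`.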

1.3 What is the theoretical mean of each process and is it close to the sample mean?
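For a stationary process written as \(Y_t = c + \sum_i \phi_i Y_{t-i} + \epsilon_t + \sum_j \theta_j \epsilon_{t-j}\), taking expectations of both sides gives \(\mu = c / (1 - \sum_i \phi_i)\); the MA coefficients do not affect the mean. The sketch below evaluates that formula. The helper name `theoretical_mean` is illustrative, and the result is only meaningful when the process is stationary.

```python
def theoretical_mean(const, ar):
    """Mean of a stationary ARMA process: const / (1 - sum of AR coefficients).

    ar holds the coefficients in right-hand-side form, so (1 + 0.5L) Y_t = 2 + eps_t
    is passed as const=2, ar=[-0.5]. A pure MA process has ar=[].
    """
    denominator = 1.0 - sum(ar)
    if denominator == 0:
        raise ValueError("AR coefficients sum to one: the mean is not defined.")
    return const / denominator

mean_ma = theoretical_mean(1.0, [])          # Y_t = 1 + (1 + 0.5L) eps_t
mean_ar1 = theoretical_mean(2.0, [-0.5])     # (1 + 0.5L) Y_t = 2 + eps_t
mean_ar2 = theoretical_mean(2.0, [0.0, 0.2]) # (1 - 0.2L^2) Y_t = 2 + eps_t
print(mean_ma, mean_ar1, mean_ar2)
```

Comparing these values with `np.mean` of the simulated series answers the second half of 1.3.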

1.4 Plot the **sample** ACF and PACF - what can you say about the processes using only these plots?
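To make clear what the sample ACF plot displays, the sketch below computes the sample autocorrelations by hand with NumPy and compares them with the approximate 95% band \(\pm 1.96/\sqrt{T}\). The helper `sample_acf` and the seed are assumptions made for illustration; for the plots themselves, `plot_acf` and `plot_pacf` from `statsmodels.graphics.tsaplots` are a common choice.

```python
import numpy as np

def sample_acf(y, nlags=20):
    """Sample autocorrelations for lags 0..nlags (biased estimator, common denominator)."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    acf = [1.0]
    for k in range(1, nlags + 1):
        acf.append(np.sum(dev[k:] * dev[:-k]) / denom)
    return np.array(acf)

np.random.seed(123)  # placeholder seed: replace with your student code
T = 150
eps = np.random.normal(0.0, 0.5, size=T)
# Y_t = 1 + (1 + 0.5L) eps_t, with eps_0 = 0
y = 1.0 + eps + 0.5 * np.concatenate(([0.0], eps[:-1]))

acf_values = sample_acf(y, nlags=10)
band = 1.96 / np.sqrt(T)
significant_lags = [k for k in range(1, 11) if abs(acf_values[k]) > band]
print("sample ACF:", np.round(acf_values, 3))
print("lags outside the band:", significant_lags)
```

For this MA(1) example the theoretical lag-1 autocorrelation is \(0.5 / (1 + 0.5^2) = 0.4\) and all higher lags are zero, which is the cut-off pattern to look for in the plot.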

2.1 Estimate the models from the generated data (note: use `arima_model` from the `statsmodels.tsa` library and specify the lag order from part `1.1`). What are the coefficient values of your estimated models? Are they close to the actual values?

2.2 Use the `arma_order_select_ic` function (from the `statsmodels.tsa` library) to fit the best model for each series. Are the model and the coefficients suggested by `arma_order_select_ic` the same as the ones used to generate the data?

2.3 If your models are stationary and/or invertible - re-estimate them as either pure MA or pure AR models, either by restricting the options in `2.2` or by selecting an arbitrarily high lag order for the model.

Remember that the previous models are generated with shocks \(\epsilon_t \sim WN(0, \sigma^2)\).

3.1 Plot the **residuals** of the models selected by `arma_order_select_ic`. Does the time series plot look like \(WN\)?

3.2 Plot the sample ACF and PACF of your model **residuals** - do they look like WN?

3.3 Perform the Ljung-Box Test on the residuals of your models. Are the residuals WN?

4.1 Which models are better in terms of AIC: the ones from (2.1) or the ones from (2.2)?

4.2 Using (4.1) along with the results from the residual tests, select the best model and forecast 20 periods ahead. What can you say about the forecasts, i.e. how do the forecast values change as the forecast period increases?
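To see what to expect from the forecasts, the sketch below iterates the forecast recursion of a stationary AR(1) by hand, using the true coefficients of \((1 + 0.5L) Y_t = 2 + \epsilon_t\) rather than estimated ones. The helper `ar1_forecast` and the last observed value of `3.0` are hypothetical; future shocks are replaced by their expected value of zero.

```python
def ar1_forecast(const, phi, last_value, steps=20):
    """Iterate Y_hat(h) = const + phi * Y_hat(h-1), starting from the last observation."""
    forecasts = []
    previous = last_value
    for _ in range(steps):
        previous = const + phi * previous  # future shocks have expectation zero
        forecasts.append(previous)
    return forecasts

# (1 + 0.5L) Y_t = 2 + eps_t  is  Y_t = 2 - 0.5 Y_{t-1} + eps_t
path = ar1_forecast(const=2.0, phi=-0.5, last_value=3.0, steps=20)
long_run_mean = 2.0 / (1.0 + 0.5)

for h in (1, 2, 5, 10, 20):
    print(f"h={h:2d}  forecast={path[h - 1]:.6f}  gap to mean={path[h - 1] - long_run_mean:+.6f}")
```

The gap between the forecast and the process mean shrinks by a factor of \(|\phi|\) at every step, so forecasts from a stationary model converge to the mean as the horizon grows. For an \(MA(q)\) model the forecast equals the mean for every horizon beyond \(q\).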