Note: when generating data use np.random.seed(student_code) where student_code is your unique student code.

1. Model generation and initial analysis

Examine the following processes

1.1 What kind of models are specified in the equations - \(AR(p)\), \(MA(q)\) or \(ARMA(p,q)\) (do not forget to specify p and q)? Are the processes stationary and/or invertible? Explain your answers.
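
As a sketch of the root check behind 1.1: a process is stationary when all roots of the AR polynomial \(\Phi(z) = 1 - \phi_1 z - ... - \phi_p z^p\) lie outside the unit circle, and invertible when the same holds for the MA polynomial \(\Theta(z) = 1 + \theta_1 z + ... + \theta_q z^q\). The coefficients below (\(\phi_1 = 0.5\), \(\theta_1 = 0.4\)) and the helper name are hypothetical and chosen only for illustration; substitute the coefficients of the assigned equations.

```python
import numpy as np

def roots_outside_unit_circle(lag_coefs):
    """Check the roots of 1 + c_1 z + ... + c_k z^k.

    Returns (flag, roots); flag is True when every root has modulus > 1.
    """
    ascending = np.r_[1.0, np.asarray(lag_coefs, dtype=float)]
    roots = np.roots(ascending[::-1])  # np.roots expects the highest power first
    return bool(np.all(np.abs(roots) > 1.0)), roots

# Hypothetical ARMA(1,1): Y_t = 1 + 0.5 Y_{t-1} + eps_t + 0.4 eps_{t-1}
phi = [0.5]    # AR coefficients (assumed values)
theta = [0.4]  # MA coefficients (assumed values)

# The AR polynomial is 1 - phi_1 z - ..., so the AR coefficients enter with a minus sign
stationary, ar_roots = roots_outside_unit_circle([-p for p in phi])
invertible, ma_roots = roots_outside_unit_circle(theta)

print("AR roots:", ar_roots, "stationary:", stationary)
print("MA roots:", ma_roots, "invertible:", invertible)
```

Passing an empty coefficient list returns True, which matches the fact that a pure MA process is always stationary and a pure AR process is always invertible.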

1.2 Simulate the data for each process with sample size \(T = 150\) and a \(WN\) component \(\epsilon_t \sim \mathcal{N}(0, 0.5^2)\). Assume that if \(t \leq 0\), then \(Y_t = \epsilon_t = 0\).
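
A minimal simulation sketch for 1.2, using only NumPy. The process (an ARMA(1,1) with intercept 1, \(\phi_1 = 0.5\), \(\theta_1 = 0.4\)) and the value of student_code are placeholders, not the assigned ones. The zero initial values implement the assumption \(Y_t = \epsilon_t = 0\) for \(t \leq 0\).

```python
import numpy as np

student_code = 12345  # placeholder: replace with your own student code
np.random.seed(student_code)

T = 150
sigma = 0.5                          # standard deviation of the WN component
delta, phi, theta = 1.0, 0.5, 0.4    # hypothetical ARMA(1,1) parameters

eps = np.random.normal(loc=0.0, scale=sigma, size=T)
Y = np.zeros(T)

y_prev = e_prev = 0.0  # Y_t = eps_t = 0 for t <= 0
for t in range(T):
    Y[t] = delta + phi * y_prev + eps[t] + theta * e_prev
    y_prev, e_prev = Y[t], eps[t]

print(Y[:5])
```

For higher lag orders the same loop applies with one extra "previous value" per lag, each initialised to zero.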

1.3 What is the theoretical mean of each process and is it close to the sample mean?
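
For 1.3, recall that a stationary process with intercept \(\delta\) and AR coefficients \(\phi_1, ..., \phi_p\) has mean \(\delta / (1 - \phi_1 - ... - \phi_p)\); for a pure MA process the mean is the intercept itself. The sketch below compares this with the sample mean for a hypothetical AR(1) process (intercept 1, \(\phi_1 = 0.5\)); the function name, parameter values and student_code are assumptions for illustration.

```python
import numpy as np

def theoretical_mean(delta, phi):
    """Mean of a stationary AR/ARMA process: delta / (1 - sum of AR coefficients)."""
    return delta / (1.0 - np.sum(phi))

student_code = 12345  # placeholder: replace with your own student code
np.random.seed(student_code)

T, sigma = 150, 0.5
delta, phi = 1.0, [0.5]  # hypothetical AR(1): Y_t = 1 + 0.5 Y_{t-1} + eps_t

eps = np.random.normal(0.0, sigma, size=T)
Y = np.zeros(T)
y_prev = 0.0  # Y_t = 0 for t <= 0
for t in range(T):
    Y[t] = delta + phi[0] * y_prev + eps[t]
    y_prev = Y[t]

print("theoretical mean:", theoretical_mean(delta, phi))
print("sample mean:     ", Y.mean())
```

Because the simulation starts from zero rather than from the process mean, the first few observations pull the sample mean slightly toward zero.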

1.4 Plot the sample ACF and PACF - what can you say about the processes using only these plots?

2. Model estimation

2.1 Estimate the models from the generated data (note: use the ARIMA class from statsmodels.tsa.arima.model, which replaces the older statsmodels.tsa.arima_model module, and specify the lag order from part 1.1). What are the coefficient values of your estimated models? Are they close to the actual values?

2.2 Use the arma_order_select_ic function (from statsmodels.tsa.stattools) to select the best model for each series. Are the model and the coefficients suggested by arma_order_select_ic the same as the ones used to generate the data?

2.3 If your models are stationary and/or invertible, re-estimate them as either pure MA or pure AR models, either by restricting the options in 2.2 or by selecting an arbitrarily high lag order for the model.

3. Residual tests

Remember that the previous models are generated with shocks \(\epsilon_t \sim WN(0, \sigma^2)\).

3.1 Plot the residuals of the models selected by arma_order_select_ic. Does the time series plot look like \(WN\)?

3.2 Plot the sample ACF and PACF of your model residuals - do they look like WN?

3.3 Perform the Ljung-Box Test on the residuals of your models. Are the residuals WN?

4. Model Forecasts

4.1 Which model is better in terms of AIC: the ones from (2.1), or (2.2)?

4.2 Using (4.1) along with the results from the residual tests, select the best model and forecast 20 periods ahead. What can you say about the forecasts, i.e. how do the forecast values change as the forecast period increases?