Python code for Time Series Analysis

Please note that you should have completed the Basic Python Tutorial beforehand.

The first subsection of this document repeats the same steps for time series analysis as in the examples with R. The code below uses econometric/statistical methods developed with Statsmodels.

The second chapter deals with GARCH model specification and estimation.

Time Series Examples

The required packages are provided at the start of the document.

In [1]:
#Import the required modules for data generation
import numpy as np

#Import the required modules for plot creation:
import matplotlib.pyplot as plt

#Import the required modules for DataFrame creation:
import pandas as pd

#import the required modules for TimeSeries data generation:
import statsmodels.api as sm
#Import the required modules for test statistic calculation:
import statsmodels.stats as sm_stat
#Import the required modules for model estimation:
import statsmodels.tsa as smt
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\compat\pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools

Generating a simple stationary process:

We can generate the stationary process:

  • $Y_t = \epsilon_t$, where $\epsilon_t \sim \mathcal{N}(0,1)$
In [2]:
#Set the seed:
np.random.seed(1)
nsample = 200
x1 = np.random.normal(loc = 0.0, scale = 1.0, size = nsample) 
#loc == mean; scale == sandard deviation

plt.plot(x1)
plt.title("WN, mean = 0")
plt.axhline(y = 0, color='r', linestyle='-')
plt.show()

By drawing the red horizontal line we can visually inspect whether the time series fluctuates around the mean.

Generating Non-Stationary processes:

We will generate the three examples of non-stationary time series:

  • $Y_t = t + \epsilon_t$, where $\epsilon_t \sim \mathcal{N}(0,1)$;
  • $Y_t = \epsilon_t \cdot t$, where $\epsilon_t \sim \mathcal{N}(0,1)$;
  • $Y_t = \sum_{j = 1}^t Z_j$, where each independent variable $Z_j$ is either $1$ or $-1$, with a 50% probability for either value.
In [3]:
np.random.seed(1)

n = 150

ns1 = np.random.normal(loc = 0.0, scale = 1.0, size = nsample) + np.arange(start = 1, stop = nsample + 1, step = 1)
ns2 = np.random.normal(loc = 0.0, scale = 1.0, size = nsample) * np.arange(start = 1, stop = nsample + 1, step = 1)
ns3 = np.cumsum(np.random.choice([-1, 1], size = nsample, replace = True))

#Note that for np.arrange() if we specify start = 1, end = n, then only (n-1) elements will be created

fig_size = [12, 5]
plt.rcParams["figure.figsize"] = fig_size

plt.subplot(2,1,1) # a 2-row, 1-column figure: go to the first subplot
plt.plot(ns1)
plt.title("non stationary in mean")

plt.subplot(2,2,3) # a 2-row, 2-column figure: go to the third subplot
plt.plot(ns2)
plt.title("non stationary in variance")

plt.subplot(2,2,4) # a 2-row, 2-column figure: go to the fourth subplot
plt.plot(ns3)
plt.title("no clear tendency")

#minimize the overlap of subplots (titles, axis labels etc.):
plt.tight_layout()
plt.show()

The three main cases of a non-stationary process: a non-constant mean, a changing variance. The last example is also non-stationary though it is not immediately apparent from a small sample size.

We can check how this kind of process would look like with different sample sizes (the vertical lines are there to distinguish where the sample intervals are in each plot):

In [4]:
np.random.seed(1)

#Generate the data
ns31 = np.cumsum(np.random.choice([-1, 1], size = 1000, replace = True))

#Plot different data samples:
fig_size = [12, 5]
plt.rcParams["figure.figsize"] = fig_size

plt.subplot(2,2,1)
plt.plot(ns31[0:99])
plt.axvline(x = 99, color='r', linestyle='-')
plt.subplot(2,2,2)
plt.plot(ns31[0:299])
plt.axvline(x = 99, color='r', linestyle='-')
plt.axvline(x = 299, color='g', linestyle='-')
plt.subplot(2,2,3)
plt.plot(ns31[0:599])
plt.axvline(x = 99, color='r', linestyle='-')
plt.axvline(x = 299, color='g', linestyle='-')
plt.axvline(x = 599, color='orange', linestyle='-')
plt.subplot(2,2,4)
plt.plot(ns31[0:999])
plt.axvline(x = 99, color='r', linestyle='-')
plt.axvline(x = 299, color='g', linestyle='-')
plt.axvline(x = 599, color='orange', linestyle='-')

plt.tight_layout()
plt.show()

We can see that, as we increase the sample size, the variance of the data increases.

Because we know that for this type of process $Var(Y_t) = t$, we can check this using a Monte Carlo simulation:

In [5]:
np.random.seed(1)
#A data.frame equivalent in Python:
df = pd.DataFrame()

for j in range(0, 1000):
    temp_data = pd.DataFrame(np.cumsum(np.random.choice([-1, 1], size = 11, replace = True)))
    temp_data.columns = ["X" + str(j+1)]
    #Bind the data column-wise:
    df = pd.concat([df.reset_index(drop=True), temp_data], axis = 1)
    
df.index = [name + str(number) for name in 'X' for number in range(1, 12)]
df.columns = ["Sample_" + str(number) for number in range(1, 1001)]

We can preview the created data.frame and calculate the variance in each time period:

In [6]:
print("Number of columns: ", len(df.columns))
print("Number of rows: ", len(df.index))
print("Variance:")
print(df.var(axis = 1))
Number of columns:  1000
Number of rows:  11
Variance:
X1      1.000805
X2      2.009994
X3      3.058158
X4      3.903580
X5      4.906791
X6      6.171868
X7      7.105630
X8      8.165281
X9      9.538815
X10    10.648649
X11    11.911287
dtype: float64

We can see that as t increases and we look at $X_1,X_2,...,X_{11}$, the variance also increases, which is in line with the theoretical calculations.

We can preview some of the data (note that we select the first five rows [ from index 0, to index 4] and the first 4 columns[from index 0 to index3]).

In [7]:
df.iloc [0:5, 0:4]
Out[7]:
Sample_1 Sample_2 Sample_3 Sample_4
X1 1 1 -1 1
X2 2 0 -2 2
X3 1 1 -1 3
X4 0 2 -2 4
X5 1 1 -3 3

Here are some of the sample series plotted with different colors:

In [8]:
df.iloc[:, [0, 4, 8, 10]].plot(subplots=False, legend=True)
plt.show()

And here is how all of the series look like on the same plot:

In [9]:
df.plot(subplots=False, legend=False, color='k')
plt.show()

We see that we get a plot that resembles a board. If we increase our simulation number from 1000 to more (5000, 10000, etc.) then our plot would have some additional directories, which were not present in the simulation.


ACF and PACF plots

Plot the sample ACF and PACF of the non-stationary time series:

In [10]:
#Convert the data to a pandas DataFrame:
ns1 = pd.DataFrame(ns1, index = range(1, len(ns1) + 1))
ns2 = pd.DataFrame(ns2, index = range(1, len(ns1) + 1))
ns3 = pd.DataFrame(ns3, index = range(1, len(ns1) + 1))

fig = plt.figure(figsize=(16,4))
fig = sm.graphics.tsa.plot_acf(ns1, lags=40, ax = fig.add_subplot(131))
fig.gca().set_ylim([-0.2, 1.1]) #Allows us to change the y axis limit on the plot
fig = sm.graphics.tsa.plot_acf(ns2, lags=40, ax = fig.add_subplot(132))
fig = sm.graphics.tsa.plot_acf(ns3, lags=40, ax = fig.add_subplot(133))
In [11]:
fig = plt.figure(figsize=(16,4))
fig = sm.graphics.tsa.plot_pacf(ns1, lags=40, ax = fig.add_subplot(131))
fig = sm.graphics.tsa.plot_pacf(ns2, lags=40, ax = fig.add_subplot(132))
fig = sm.graphics.tsa.plot_pacf(ns3, lags=40, ax = fig.add_subplot(133))
  • The confidence intervals for the ACF, the standard deviation is calculated using Bartlett’s formula.

  • The confidence intervals for the PACF, the standard deviation is calculated using 1/sqrt(len(x)).

Testing If a Time Series is White Noise

As an example, read the quarterly Canadian employment index data.

In [12]:
txt1 = "http://uosis.mif.vu.lt/~rlapinskas/(data%20R&GRETL/caemp.txt"
#Read the data from the url:
caemp = pd.read_csv(txt1)
#Add a time index:
caemp.index = pd.date_range(start = "1960-01", periods = len(caemp.index), freq = "M").to_period()
In [13]:
fig = plt.figure(figsize=(14,5))
caemp.plot(ax = fig.add_subplot(211), title = "Canadian Employment Data")
fig = sm.graphics.tsa.plot_acf(caemp, lags = 20, ax = fig.add_subplot(223))
fig = sm.graphics.tsa.plot_pacf(caemp, lags = 20, ax = fig.add_subplot(224))
plt.tight_layout()

We can test the WN hypothesis by running the Ljung-Box test statistic. Our null hypothesis:

$$H_0: \rho(1) = 0, ..., \rho(k) = 0$$

Let us test it for ACF up to lag 5, one by one:

In [14]:
sm_stat.diagnostic.acorr_ljungbox(caemp, lags = 5)
Out[14]:
(array([ 127.73122313,  240.45167522,  337.31925806,  418.2578357 ,
         484.15850152]),
 array([  1.28520642e-029,   6.11761503e-053,   8.30417003e-073,
          3.15464718e-089,   2.09558653e-102]))

The first array is the X-squared statistic. The second array is the associated p-values.

In [15]:
rez = pd.DataFrame(list(sm_stat.diagnostic.acorr_ljungbox(caemp, lags = 5)), index = ["X-squared", "p-value"], columns = range(1, 6))
rez
Out[15]:
1 2 3 4 5
X-squared 1.277312e+02 2.404517e+02 3.373193e+02 4.182578e+02 4.841585e+02
p-value 1.285206e-29 6.117615e-53 8.304170e-73 3.154647e-89 2.095587e-102

In all cases, we reject the null hypothesis that all of the autocorrelation functions up to k are zero.

Note that, if we reject the null hypothesis $H_0: \rho(1) = 0, ..., \rho(k) = 0$ - this means that at least one ACF is non-zero - this does not necessarily mean that all are non-zero.

An example of performing the Ljung-Box test on $\epsilon_t \sim \mathcal{N}(0,1)$:

In [16]:
np.random.seed(1)

#Correlation with the first lag:
rez = sm_stat.diagnostic.acorr_ljungbox(np.random.normal(loc = 0.0, scale = 1.0, size = 100) , lags = 1)
print(rez)
#Correlation with the 2nd lag only:
rez = sm_stat.diagnostic.acorr_ljungbox(np.random.normal(loc = 0.0, scale = 1.0, size = 100) , lags = [2])
print(rez)
(array([ 0.15775836]), array([ 0.69122878]))
(array([ 2.97010283]), array([ 0.2264907]))

p-value < 0.05 - we do not reject the null hypothesis $H_0: \rho(1) = 0$.


Moving-Average MA(q) Models

We will generate three MA models with $\epsilon_t \sim \mathcal{N}(0,1)$:

  • $Y_t = 3 + \epsilon_t + 0.5 \epsilon_{t-1}$;
  • $Y_t = \epsilon_t + 1.3 \epsilon_{t-1}$;
  • $Y_t = \epsilon_t - 1.3 \epsilon_{t-1} + 0.4 \epsilon_{t-2}$;

We can generate the processes in three ways: manually or using ArmaProcess from the statsmodels.tsa.arima_process module.

Generating manually involves writing the code equivalent of the previously provided mathematical formulas:

In [17]:
np.random.seed(100)

n = 5000
epsilon = np.random.normal(loc = 0.0, scale = 1.0, size = n)
MA_m1 = 3 + epsilon + 0.5 * np.append([0], epsilon[:-1])
MA_m2 = epsilon + 1.3 * np.append([0], epsilon[:-1])
MA_m3 = epsilon - 1.3 * np.append([0], epsilon[:-1]) + 0.4 * np.append([0, 0], epsilon[:-2])

The mean of each sample is:

In [18]:
MA_m1.mean()
Out[18]:
3.0129627116832554
In [19]:
MA_m2.mean()
Out[19]:
0.019810045532363273
In [20]:
MA_m3.mean()
Out[20]:
0.0010863737781601293

which is what we expect (Note: you can check this by calculating $\mathbb{E}Y_t$ for each case).

The variance of the second process:

In [21]:
MA_m2.var()
Out[21]:
2.8027618282371676

is very close to the theoretical value $Var(Y_t) = (1 + \theta^2)\sigma^2 = 1 + (1.3)^2 = 2.69$.

The ACF and PACF or the three different models:

In [22]:
fig = plt.figure(figsize=(14,5))

#ACF plots:
sm.graphics.tsa.plot_acf(MA_m1, lags = 40, ax = fig.add_subplot(231), title = '$ Y_{t} = 3 + \epsilon_t + 0.5 \epsilon_{t-1}$')
plt.ylabel('ACF')
sm.graphics.tsa.plot_acf(MA_m2, lags = 40, ax = fig.add_subplot(232), title = '$ Y_{t} = \epsilon_t + 1.3 \epsilon_{t-1}$')
plt.ylabel('ACF')
sm.graphics.tsa.plot_acf(MA_m3, lags = 40, ax = fig.add_subplot(233), title = '$ Y_{t} = \epsilon_t - 1.3 \epsilon_{t-1} + 0.4 \epsilon_{t-2}$')
plt.ylabel('ACF')

#PACF plots:
sm.graphics.tsa.plot_pacf(MA_m1, lags = 40, ax = fig.add_subplot(234), title = '$ Y_{t} = 3 + \epsilon_t + 0.5 \epsilon_{t-1}$')
plt.ylabel('PACF')
sm.graphics.tsa.plot_pacf(MA_m2, lags = 40, ax = fig.add_subplot(235), title = '$ Y_{t} = \epsilon_t + 1.3 \epsilon_{t-1}$')
plt.ylabel('PACF')
sm.graphics.tsa.plot_pacf(MA_m3, lags = 40, ax = fig.add_subplot(236), title = '$ Y_{t} = \epsilon_t - 1.3 \epsilon_{t-1} + 0.4 \epsilon_{t-2}$')
plt.ylabel('PACF')
plt.tight_layout()

For MA(1) the ACF has a sharp cutoff after lag = 1, for MA(2) - the sharp cutoff is after lag = 2. The PACF of the MA(q) process is slowly decaying to zero.

Note that the second equation does not meet the requirement that $|\theta| < 1$. Therefore, it is not invertible, i.e. it cannot be expressed as an infinite-order AR process.


Autoregressive AR(p) Models

We will generate three AR models with $\epsilon_t \sim \mathcal{N}(0,1)$:

  • $Y_t = 3 + 0.5 Y_{t-1} + \epsilon_t$;
  • $Y_t = 1.2 Y_{t-1} + \epsilon_t$;
  • $Y_t = 1.2 Y_{t-1} - 0.5 Y_{t-2} + \epsilon_t$;

We will also give an example of model generation using using ArmaProcess module:

In [23]:
np.random.seed(100)

n = 5000
epsilon = np.random.normal(loc = 0.0, scale = 1.0, size = n)

AR_m1 = np.array(3.0 + epsilon[0], ndmin = 1)
AR_m2 = np.array(epsilon[0], ndmin = 1)

for j in range(1, n):
    AR_m1 = np.append(AR_m1, 3 + 0.5 * AR_m1[j -1] + epsilon[j])
    AR_m2 = np.append(AR_m2, 1.2 * AR_m2[j - 1] + epsilon[j])

#Specify the model:
AR_m3_spec = smt.arima_process.ArmaProcess(ar = np.array([1, -1.2, 0.5]), ma = [1])
#Generate the data sample:
AR_m3 = AR_m3_spec.generate_sample(n)
C:\Users\Andrius\Anaconda3\lib\site-packages\ipykernel_launcher.py:11: RuntimeWarning: overflow encountered in double_scalars
  # This is added back by InteractiveShellApp.init_path()

For explanations on the coefficient inclusion in the ArmaProcess function, see the documentation and the second section for ARMA model generation.

We also get a RuntimeWarning about an overflowing range of possible values. These values are automatically marked as Inf:. For example, the last number in the second model:

In [24]:
AR_m2[-1]
Out[24]:
-inf

To make it easier to analyse the second model, we will reduce the sample of the generated data:

In [25]:
AR_m2 = AR_m2[0:100]
plt.plot(AR_m2)
plt.show()

Because the coefficient $\phi = 1.2 \geq 1$ the time series declines exponentially (the growth (from R examples) or decline also depends on the signs of the initial shocks $\epsilon_1, \epsilon_2,...$).

The mean of each sample is:

In [26]:
print(AR_m1.mean())
print(AR_m2.mean())
print(AR_m3.mean())
6.01614212604
-1093889.20244
-0.0213108930178

Remember that for the generalized AR(1) case (with constant component) the mean is $m = \alpha/(1-\phi)$. For the first model, we have that $\alpha = 3$ and $\phi = 0.5$ $\Rightarrow$ $m = 6$ (which is similar to what the sample mean of the first model is).

Note that the second model violates the requirement that $|\phi| < 1$. This kind of AR process is not stationary.

Trying to generate this kind of model using ArmaProcess is possible:

In [27]:
np.random.seed(100)
#Specify the model:
AR_m2_spec = smt.arima_process.ArmaProcess(ar = np.array([1, -1.2]), ma = [1])
#Check if it is stationary:
print("Stationarity: ", AR_m2_spec.isstationary)
plt.plot(AR_m2_spec.generate_sample(50))
plt.show()
Stationarity:  False

We can also immediately test for stationarity using the isstationary method. Eventhough it is not a stationary process, we can still generate it using the provided method.

The ACF and PACF:

In [28]:
fig = plt.figure(figsize=(14,5))

#ACF plots:
sm.graphics.tsa.plot_acf(AR_m1, lags = 20, ax = fig.add_subplot(231), title = '$Y_t = 3 + 0.5 Y_{t-1} + \epsilon_t$')
plt.ylabel('ACF')
sm.graphics.tsa.plot_acf(AR_m2, lags = 20, ax = fig.add_subplot(232), title = '$Y_t = 1.2 Y_{t-1} + \epsilon_t$')
plt.ylabel('ACF')
sm.graphics.tsa.plot_acf(AR_m3, lags = 20, ax = fig.add_subplot(233), title = '$Y_t = 1.2 Y_{t-1} - 0.5 Y_{t-2} + \epsilon_t$')
plt.ylabel('ACF')

#PACF plots:
sm.graphics.tsa.plot_pacf(AR_m1, lags = 20, ax = fig.add_subplot(234), title = '$Y_t = 3 + 0.5 Y_{t-1} + \epsilon_t$')
plt.ylabel('PACF')
sm.graphics.tsa.plot_pacf(AR_m2, lags = 20, ax = fig.add_subplot(235), title = '$Y_t = 1.2 Y_{t-1} + \epsilon_t$')
plt.ylabel('PACF')
sm.graphics.tsa.plot_pacf(AR_m3, lags = 20, ax = fig.add_subplot(236), title = '$Y_t = 1.2 Y_{t-1} - 0.5 Y_{t-2} + \epsilon_t$')
plt.ylabel('PACF')
plt.tight_layout()

For the AR(p) models, the ACF decays gradually to zero. For the AR(1), the PACF has a sharp cutoff after lag = 1.

For the AR(2), the PACF has a sharp cutoff after lag = 2.

Note that even though the second model is not stationary, the PACF is cut-off after the first lag and the ACF is slowly decaying (because the coefficient $\phi > 1$ the decay is slower compared to the first model with $\phi = 0.5$)

Autoregressive Moving-Average ARMA(p,q) Models

We will generate an ARMA model:

  • $Y_t = 0.4 Y_{t-1} + \epsilon_t + 0.6 \epsilon_{t-1}$
In [29]:
arparams = np.array([0.4])
maparams = np.array([0.6])

The conventions of the ArmaProcess function require that we specify a 1 for the zero-lag of the AR and MA parameters and that the AR parameters be negated, see the documentation:

Both the AR and MA components should include the coefficient on the zero-lag. This is typically 1. Further, due to the conventions used in signal processing used in signal.lfilter vs. conventions in statistics for ARMA processes, the AR paramters should have the opposite sign of what you might expect

In [30]:
arparams = np.r_[1, -arparams] # a quick way to build arrays quickly - by using "r_"
maparams = np.r_[1, maparams]

#Specify the model:
ARMA_spec = smt.arima_process.ArmaProcess(ar = arparams, ma = maparams)
#Check if it is stationary:
print("Stationarity: ", ARMA_spec.isstationary)
#Check if it is invertible
print("Invertibility: ", ARMA_spec.isinvertible)
#Generate the data sample:
ARMA_m1 = ARMA_spec.generate_sample(n)
Stationarity:  True
Invertibility:  True

The ACF and PACF:

In [31]:
fig = plt.figure(figsize=(10,5))
#ACF plots:
sm.graphics.tsa.plot_acf(ARMA_m1, lags = 20, ax = fig.add_subplot(211), title = '$Y_t = 0.4 Y_{t-1} + \epsilon_t + 0.6 \epsilon_{t-1}$')
plt.ylabel('ACF')
#PACF plots:
sm.graphics.tsa.plot_pacf(ARMA_m1, lags = 20, ax = fig.add_subplot(212), title = '$Y_t = 0.4 Y_{t-1} + \epsilon_t + 0.6 \epsilon_{t-1}$')
plt.ylabel('PACF')
plt.tight_layout()

Both ACF and PACF decline for the ARMA process.

Model Estimation with statsmodels.tsa.arima_model.ARMA

Let’s say we do not know the data generating process that generated out previously mentioned AR, MA and ARMA models. If we examined the ACF and PACF graphs and we have an idea of what our underlying model might be, we can use the tsa.arima_model.ARMA

In [32]:
MA_m1_est = smt.arima_model.ARMA(MA_m1, order = (0, 1))
MA_m1_mdl = MA_m1_est.fit(trend = 'c')

MA_m2_est = smt.arima_model.ARMA(MA_m2, order = (0, 1))
MA_m2_mdl = MA_m2_est.fit(trend = 'nc')

Because the third MA process is not invertible, we need to specify transparams = False, otherwise we will get an error. We also need to specify some starting aprameters. For this example, we select start_params = [-1, 0.5] - they are somehwt close to the parameters which we used to generate the data:

In [33]:
MA_m3_est = smt.arima_model.ARMA(MA_m3, order = (0, 2))
MA_m3_mdl = MA_m3_est.fit(trend = 'nc', transparams = False, start_params = [-1, 0.5]) 
In [34]:
AR_m1_est = smt.arima_model.ARMA(AR_m1, order = (1, 0))
AR_m1_mdl = AR_m1_est.fit(trend = 'c')

AR_m2_est = smt.arima_model.ARMA(AR_m2, order = (1, 0))
AR_m2_mdl = AR_m2_est.fit(trend = 'nc')

AR_m3_est = smt.arima_model.ARMA(AR_m3, order = (2, 0))
AR_m3_mdl = AR_m3_est.fit(trend = 'nc')

ARMA_m1_est = smt.arima_model.ARMA(ARMA_m1, order = (1, 1))
ARMA_m1_mdl = ARMA_m1_est.fit(trend = 'nc')

The summary statistics of our models:

In [35]:
print(MA_m1_mdl.summary())
                              ARMA Model Results                              
==============================================================================
Dep. Variable:                      y   No. Observations:                 5000
Model:                     ARMA(0, 1)   Log Likelihood               -7181.051
Method:                       css-mle   S.D. of innovations              1.017
Date:                Mon, 19 Feb 2018   AIC                          14368.103
Time:                        22:49:08   BIC                          14387.654
Sample:                             0   HQIC                         14374.955
                                                                              
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          3.0129      0.022    139.766      0.000       2.971       3.055
ma.L1.y        0.4983      0.012     41.547      0.000       0.475       0.522
                                    Roots                                    
=============================================================================
                 Real           Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
MA.1           -2.0067           +0.0000j            2.0067            0.5000
-----------------------------------------------------------------------------

We can extract specific information from this class:

In [36]:
print("Included trend (this is actually the constant): ", MA_m1_mdl.k_trend)
print("Included constat: ", MA_m1_mdl.k_constant)
print("Number of AR parameters: ", MA_m1_mdl.k_ar)
print("Number of MA parameters" , MA_m1_mdl.k_ma)
print("Included exogenous variables: ", MA_m1_mdl.k_exog)

print("All of the parameters: ", MA_m1_mdl.params)
print("All of the p-values: ", MA_m1_mdl.pvalues)
print("Only AR parameters: ", MA_m1_mdl.arparams)
print("Only MA parameters: ", MA_m1_mdl.maparams)
Included trend (this is actually the constant):  1
Included constat:  0
Number of AR parameters:  0
Number of MA parameters 1
Included exogenous variables:  0
All of the parameters:  [ 3.01290667  0.49833892]
All of the p-values:  [ 0.  0.]
Only AR parameters:  []
Only MA parameters:  [ 0.49833892]

If we only try to extract the coefficients:

In [37]:
print(ARMA_m1_mdl.params)
[ 0.41564733  0.58973942]

We do not get the names of the coefficients (at least this is the case with the current working environment).

To only leave the relevant info, we only take the coefficients and their standard errors and define the following method to get the coefficient names:

In [38]:
def get_coef(my_fitted_model):
    #Get the parameters and their pvalues
    rez = np.array([my_fitted_model.params, my_fitted_model.pvalues]).T
    #Set the parameter names
    my_index = None
    #Check for specific parameters:
    if my_fitted_model.k_trend > 0:
       my_index = np.append(my_index, "Const")
    if my_fitted_model.k_ar > 0:
       my_index = np.append(my_index, ["AR_L" + str(number) for number in range(1, my_fitted_model.k_ar + 1)]) 
    if my_fitted_model.k_ma > 0:
       my_index = np.append(my_index, ["MA_L" + str(number) for number in range(1, my_fitted_model.k_ma + 1)]) 
    #return a pandas DataFrame
    df = pd.DataFrame(rez)#, index = my_index, columns = ["coef", "p_value"])
    df.index = my_index[1:]
    df.columns = ["coef", "p_value"]
    return df

Now we can use this method to extract only the relevant model information:

In [39]:
get_coef(MA_m1_mdl)
Out[39]:
coef p_value
Const 3.012907 0.0
MA_L1 0.498339 0.0
In [40]:
get_coef(MA_m2_mdl)
Out[40]:
coef p_value
MA_L1 0.763487 0.0
In [41]:
get_coef(MA_m3_mdl)
Out[41]:
coef p_value
MA_L1 -1.291157 0.000000e+00
MA_L2 0.388276 4.402995e-182

We can see that our estimated coefficients are close to the actual values, except for the second process, which is not invertible.

For AR models:

In [42]:
get_coef(AR_m1_mdl)
Out[42]:
coef p_value
Const 6.015188 0.0
AR_L1 0.505785 0.0
In [43]:
get_coef(AR_m2_mdl)
Out[43]:
coef p_value
AR_L1 0.998995 8.484340e-185
In [44]:
get_coef(AR_m3_mdl)
Out[44]:
coef p_value
AR_L1 1.169103 0.000000e+00
AR_L2 -0.487480 6.058653e-297

Note that the second process is not stationary. As such the estimated value of $\phi$ is close to 1.

When estimating AR, MA or ARMA processes we assume that they are stationary (if they are not, then we need to create such models or transform the data such that it would be stationary).

Note that the Const coefficient of the first $AR(1)$ model is close to the process mean m = 6.

In [45]:
get_coef(ARMA_m1_mdl)
Out[45]:
coef p_value
AR_L1 0.415647 1.021136e-142
MA_L1 0.589739 0.000000e+00

Automated model estimation

If we are not sure what order of AR, MA or ARMA model would best fit our data, we could use an automated process to select the best possible model (in terms of either AIC or BIC), if such a method is available.

The closest equivalent to R's auto.arima for use in Python is the statsmodels.tsa.stattools.arma_order_select_ic module.

For example, it works with our second and third MA process data samples, one of which is not invertible:

In [46]:
MA_m1_est_auto = sm.tsa.arma_order_select_ic(MA_m1, ic='aic', trend='c')
print("1st MA process order: ", MA_m1_est_auto.aic_min_order)
MA_m2_est_auto = sm.tsa.arma_order_select_ic(MA_m2, ic='aic', trend='nc')
print("2nd MA process order: ", MA_m2_est_auto.aic_min_order)
MA_m3_est_auto = sm.tsa.arma_order_select_ic(MA_m3, ic='aic', trend='nc')
print("3rd MA process order: ", MA_m3_est_auto.aic_min_order)
1st MA process order:  (0, 1)
2nd MA process order:  (0, 1)
3rd MA process order:  (0, 2)

Unfortunately, this method is not fully implemented and sometimes errors may arise (especially when the process is not stationary). With our AR process data:

In [47]:
AR_m1_est_auto = sm.tsa.arma_order_select_ic(AR_m1, ic='aic', trend='c')
print("1st AR process order: ", AR_m1_est_auto.aic_min_order)
1st AR process order:  (1, 0)
In [48]:
AR_m2_est_auto = sm.tsa.arma_order_select_ic(AR_m2, ic='aic', trend='nc')
print("2nd AR process order: ", AR_m2_est_auto.aic_min_order)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:654: RuntimeWarning: divide by zero encountered in log
  invmacoefs = -np.log((1-macoefs)/(1+macoefs))
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:628: RuntimeWarning: overflow encountered in exp
  newparams = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:628: RuntimeWarning: invalid value encountered in true_divide
  newparams = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:629: RuntimeWarning: overflow encountered in exp
  tmp = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:629: RuntimeWarning: invalid value encountered in true_divide
  tmp = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tools\numdiff.py:243: RuntimeWarning: invalid value encountered in multiply
  **kwargs)).imag/2./hess[i, j]
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\kalmanf\kalmanfilter.py:654: RuntimeWarning: invalid value encountered in true_divide
  R_mat, T_mat)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:473: HessianInversionWarning: Inverting hessian failed, no bse or cov_params available
  'available', HessianInversionWarning)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:496: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:612: RuntimeWarning: divide by zero encountered in log
  invarcoefs = -np.log((1-params)/(1+params))
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:584: RuntimeWarning: overflow encountered in exp
  newparams = ((1-np.exp(-params))/
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:585: RuntimeWarning: overflow encountered in exp
  (1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:585: RuntimeWarning: invalid value encountered in true_divide
  (1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:586: RuntimeWarning: overflow encountered in exp
  tmp = ((1-np.exp(-params))/
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:587: RuntimeWarning: overflow encountered in exp
  (1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:587: RuntimeWarning: invalid value encountered in true_divide
  (1+np.exp(-params))).copy()
2nd AR process order:  (1, 2)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:612: RuntimeWarning: invalid value encountered in log
  invarcoefs = -np.log((1-params)/(1+params))

Because the second AR process is not stationary, we get a long list of errors and the closest stationary model for this dataset.

In [49]:
AR_m3_est_auto = sm.tsa.arma_order_select_ic(AR_m3, ic='aic', trend='nc')
print("3rd AR process order: ", AR_m3_est_auto.aic_min_order)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:496: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:496: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
3rd AR process order:  (3, 1)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:496: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)

If we check our third AR process, which we generated with smt.arima_process.ArmaProcess, we can see that it is both stationary and invertible:

In [50]:
print("Invertibility: ", AR_m3_spec.isinvertible)
print("Stationarity:  ", AR_m3_spec.isstationary)
Invertibility:  True
Stationarity:   True

So, the automatic model selection is not always correct. In the third AR process case we got a convergence warning when performing the likelihood function optimization.

With our ARMA process data the model order is found without any warnings or errors:

In [51]:
ARMA_m1_est_auto = sm.tsa.arma_order_select_ic(ARMA_m1, ic='aic', trend='nc')
print("ARMA process order: ", ARMA_m1_est_auto.aic_min_order)
ARMA process order:  (1, 1)

The output is the order of the $ARMA(p, q)$ model. We can then use the previously used smt.arima_model.ARMA module with the specified order. As mentioned, the calculations take longer compared to R's auto.arima.


Model Forecasts

Having estimated the models, we would like to create our time series forecasts:

Lets say we want to forecast 5 periods into the future. Since we have 5000 historic data points we would also want to reduce the historic data in the plot to, say 20.

Our code would look like:

In [52]:
fig = plt.figure(figsize=(14,8))

MA_m1_mdl.plot_predict(start = MA_m1.size - 20, end = MA_m1.size + 5, ax = fig.add_subplot(331)) #ax = Existing axes to plot with.
plt.title('$MA:\ Y_t = 3 + \epsilon_t + 0.5 \epsilon_{t-1}$')
plt.legend().remove()

MA_m2_mdl.plot_predict(start = MA_m2.size - 20, end = MA_m2.size + 5, ax = fig.add_subplot(332)) #ax = Existing axes to plot with.
plt.title('$MA:\ Y_t = \epsilon_t + 1.3 \epsilon_{t-1}$')
plt.legend().remove()

MA_m3_mdl.plot_predict(start = MA_m3.size - 20, end = MA_m3.size + 5, ax = fig.add_subplot(333)) #ax = Existing axes to plot with.
plt.title('$MA:\ Y_t = \epsilon_t - 1.3 \epsilon_{t-1} + 0.4 \epsilon_{t-2}$')
plt.legend().remove()

AR_m1_mdl.plot_predict(start = AR_m1.size - 20, end = AR_m1.size + 5, ax = fig.add_subplot(334)) #ax = Existing axes to plot with.
plt.title('$AR:\ Y_t = 3 + 0.5 Y_{t-1} + \epsilon_t$')
plt.legend().remove()

AR_m2_mdl.plot_predict(start = AR_m2.size - 20, end = AR_m2.size + 5, ax = fig.add_subplot(335)) #ax = Existing axes to plot with.
plt.title('$AR:\ Y_t = 1.2 Y_{t-1} + \epsilon_t$')
plt.legend().remove()

AR_m3_mdl.plot_predict(start = AR_m3.size - 20, end = AR_m3.size + 5, ax = fig.add_subplot(336)) #ax = Existing axes to plot with.
plt.title('$AR:\ Y_t = 1.2 Y_{t-1} - 0.5 Y_{t-2} + \epsilon_t$')
plt.legend().remove()

ARMA_m1_mdl.plot_predict(start = ARMA_m1.size - 20, end = ARMA_m1.size + 5, ax = fig.add_subplot(313)) #ax = Existing axes to plot with.
plt.title('$ARMA:\ Y_t = 0.4 Y_{t-1} + \epsilon_t + 0.6 \epsilon_{t-1}$')
plt.legend(loc = 'upper center')

plt.tight_layout()

Because the 2nd AR process is not stationary, the forecasts differ quite a bit from the historical data.

Otherwise, AR and MA model forecasts perform as we would expect: MA forecasts quickly tend to the mean, whereas AR forecasts tend to the average but never reach it.


Model residuals

After estimating our models, it is also important to check if the residuals are WN. It would be usefull to have an equivalent to tsdisplay from R, so let's define a function:

In [53]:
def tsdisplay(y, figsize = (14,8), title = ""):
    tmp_data = pd.Series(y)
    
    fig = plt.figure(figsize = figsize)
    #Plot the time series
    tmp_data.plot(ax = fig.add_subplot(311), title = "$Time\ Series\ " + title + "$", legend = False)
    #Plot the ACF:
    sm.graphics.tsa.plot_acf(tmp_data, lags = 20, zero = False, ax = fig.add_subplot(323))
    #Plot the PACF:
    sm.graphics.tsa.plot_pacf(tmp_data, lags = 20, zero = False, ax = fig.add_subplot(324))
    #Plot the QQ plot of the data:
    sm.qqplot(tmp_data, line='s', ax = fig.add_subplot(325)) 
    plt.title("QQ Plot")
    #Plot the residual histogram:
    fig.add_subplot(326).hist(tmp_data, bins = 50, normed = 1)
    plt.title("Histogram")
    #Fix the layout of the plots:
    plt.tight_layout()

Firstly, let us look at our $ARMA(1,1)$ model:

In [54]:
tsdisplay(ARMA_m1_mdl.resid, title = ":\ Residuals\ Of\ Y_t = 0.4 Y_{t-1} + \epsilon_t + 0.6 \epsilon_{t-1}")

The residuals appear to not be autocorrelated. The QQ plot suggests that they are standard normal. We can conclude that these residuals are $WN$.

Now, let's look at the MA process which is not invertible:

In [55]:
tsdisplay(MA_m2_mdl.resid, title = ":\ Residuals\ Of\ Y_t = \epsilon_t + 1.3 \epsilon_{t-1}")

As stated in the theory slides, the MA process is stationary, so the residuals are $WN$ regardless if the process is invertible or not.

Finally, lets look at the nonstationary AR process:

In [56]:
tsdisplay(AR_m2_mdl.resid, title = ":\ Residuals\ Of\ Y_t = 1.2 Y_{t-1} + \epsilon_t")

Because we are fitting a stationary AR model on a nonstationary time series and in doing so, we misspecify the model, the residuals are not $WN$.

We can also carry out the Ljung-Box tests.

For the ARMA series:

In [57]:
pd.DataFrame(list(sm_stat.diagnostic.acorr_ljungbox(pd.Series(ARMA_m1_mdl.resid), lags = 10)), index = ["X-squared", "p-value"], columns = range(1, 11))
Out[57]:
1 2 3 4 5 6 7 8 9 10
X-squared 0.078754 0.127692 0.934005 0.959198 1.112731 1.717695 3.449466 5.938348 6.520978 6.718923
p-value 0.778994 0.938149 0.817215 0.915918 0.952971 0.943747 0.840549 0.654138 0.686854 0.751687

We can see that we do not reject the null hypothesis of an uncorrelated series.

For the non-invertible MA process:

In [58]:
pd.DataFrame(list(sm_stat.diagnostic.acorr_ljungbox(pd.Series(MA_m2_mdl.resid), lags = 10)), index = ["X-squared", "p-value"], columns = range(1, 11))
Out[58]:
1 2 3 4 5 6 7 8 9 10
X-squared 0.811018 0.908880 1.713492 1.875256 2.063945 2.460120 4.061034 6.020079 6.569315 6.625401
p-value 0.367819 0.634803 0.633938 0.758689 0.840228 0.872904 0.772721 0.644983 0.681860 0.760273

Because p-value > 0.05 for the first 10 lags, we do not reject the null hypothesis of an uncorrelated series.

For the nonstationary AR process:

In [59]:
pd.DataFrame(list(sm_stat.diagnostic.acorr_ljungbox(pd.Series(AR_m2_mdl.resid), lags = 10)), index = ["X-squared", "p-value"], columns = range(1, 11))
Out[59]:
1 2 3 4 5 6 7 8 9 10
X-squared 7.133671e+01 1.211737e+02 1.559399e+02 1.801462e+02 1.969576e+02 2.085948e+02 2.166157e+02 2.221134e+02 2.258541e+02 2.283752e+02
p-value 3.011802e-17 4.869366e-27 1.378034e-33 6.936529e-38 1.271109e-40 2.805879e-42 3.448375e-43 1.377086e-43 1.227728e-43 1.881810e-43

The p-value < 0.05, we reject the null hypothesis, which means that the residuals are serially correlated (i.e. autocorrelated).


Model misspecification

Another example, this time by fitting an incorrect model.

Remember our MA(1) model which is not invertible $Y_t = \epsilon_t + 1.3 \epsilon_{t-1}$?

Let us try to fit an AR(1) model and check its residuals:

In [60]:
bad_est = smt.arima_model.ARMA(MA_m2, order = (1, 0))
bad_mdl = bad_est.fit(trend = 'nc')

tsdisplay(bad_mdl.resid, title = ":\ Misspecified\ model")
In [61]:
pd.DataFrame(list(sm_stat.diagnostic.acorr_ljungbox(pd.Series(bad_mdl.resid), lags = 10)), index = ["X-squared", "p-value"], columns = range(1, 11))
Out[61]:
1 2 3 4 5 6 7 8 9 10
X-squared 1.175365e+02 5.746887e+02 5.758068e+02 5.765802e+02 5.766548e+02 5.775478e+02 5.797327e+02 5.827731e+02 5.836104e+02 5.841349e+02
p-value 2.190348e-27 1.614143e-125 1.770036e-124 1.813612e-123 2.235928e-122 1.622522e-121 5.627523e-121 1.180791e-120 6.886772e-120 4.409622e-119

We can see that while the residuals appear to be standard normal, they are actually autocorrelated. This can be seen from both the ACF/PACF as well as from the Ljung-Box test results, where the p-value < 0.05 for different lags leads us to reject the null hypothesis of no autocorrelation.




Financial Volatility Modelling

In this section we will present an example of esimating ARCH and GARCH models with Python. We will use the arch_model function from the arch package.

To install the package, follow the documentation. Open the Anaconda Navigator, Go to 'Environments' and click the arrow near the base(root). Then enter the following:

conda install arch -c bashtage

Note: OS should be 64-bit!

Now we can import the required package:

In [62]:
import arch as arch
#arch.doc()

We can think of $ARCH$ models as $AR$ models applied to the error variance of a time series $r_t$.

$ARCH(1)$ process:

$$ \begin{cases} r_t &= \mu + \epsilon_t \\ \epsilon_t &= \sigma_t w_t \\ \sigma_t^2 &= \omega + \alpha_1 \epsilon^2_{t-1} \end{cases} $$

In [63]:
np.random.seed(123)

#Define the sample size:
n = 2000

# Define the coefficients:
mu = 1.0
omega = 2.0
alpha = 0.5

# Generate the random shocks
w = np.random.normal(size = n)

# Generate the ARCH process:
r_t = np.array(mu + np.sqrt(omega) * w[0], ndmin = 1)
for j in range(1, n):
    r_t = np.append(r_t, mu + np.sqrt(omega + alpha * ((r_t[j-1] - mu)**2)) * w[j])

First, we plot the ACF and PACF of the time series:

In [64]:
tsdisplay(r_t, title = "\ of\ r_t")

The first lags appear to be slightly correlated. Now let's plot the squared time series:

In [65]:
tsdisplay(r_t**2, title = "\ of\ r_t^2")

If we wanted to estimate an ARCH model on this data, we would first have to create a mean model for the returns r_t. Then we should inspect the residuals - if their ACF and PACF plots show no autocorrelation but the squared residual ACF and PACF plots - show autocorrelation - we should create an ARCH model for the residuals. Look at the theory slides for the detailed steps and try to to them for this model.

Use the following example of a $GARCH$ model and apply the same steps to the $ARCH$ process.

$GARCH(1, 1)$ is an $ARMA$ model applied to the error variance of a time series:

$$ \begin{cases} r_t &= \mu + \epsilon_t \\ \epsilon_t &= \sigma_t w_t \\ \sigma_t^2 &= \omega + \alpha_1 \epsilon^2_{t-1} + \beta_1 \sigma^2_{t-1} \end{cases} $$

In [66]:
np.random.seed(1)

#Define the sample size:
n = 2000

# Define the coefficients:
mu = 1.0
omega = 0.2
alpha = 0.5
beta  = 0.3

# Generate the random shocks
w = np.random.normal(size = n)
sigma_sq = np.zeros_like(w)

# Generate the GARCH process:
sigma_sq[0] = omega
r_t = np.array(mu + np.sqrt(sigma_sq[0]) * w[0], ndmin = 1)

for j in range(1, n):
    sigma_sq[j] = omega + alpha * ((r_t[j-1] - mu)**2 + beta * sigma_sq[j-1])
    r_t = np.append(r_t, mu + np.sqrt(sigma_sq[j]) * w[j])
In [67]:
tsdisplay(r_t, title = "\ of\ r_t")
In [68]:
tsdisplay(r_t**2, title = "\ of\ r_t^2")

We note, that the residuals are WN but the squared residuals are not.

We begin by creating the model for the mean by looping through all possible ARMA(p, q) model combinations with $p_{max} = q_{max} = 4$:

In [69]:
best_aic = np.inf 
best_order = None
best_mdl = None

pq_rng = range(5) # [0,1,2,3,4]
for i in pq_rng:
    for j in pq_rng:
        try:
            tmp_mdl = smt.api.ARIMA(r_t, order = (i,0,j)).fit(method = 'mle', trend = 'c')
            tmp_aic = tmp_mdl.aic
            if tmp_aic < best_aic:
                best_aic = tmp_aic
                best_order = (i, 0, j)
                best_mdl = tmp_mdl
        except: continue
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:584: RuntimeWarning: overflow encountered in exp
  newparams = ((1-np.exp(-params))/
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:585: RuntimeWarning: overflow encountered in exp
  (1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:585: RuntimeWarning: invalid value encountered in true_divide
  (1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:586: RuntimeWarning: overflow encountered in exp
  tmp = ((1-np.exp(-params))/
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:587: RuntimeWarning: overflow encountered in exp
  (1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:587: RuntimeWarning: invalid value encountered in true_divide
  (1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:628: RuntimeWarning: overflow encountered in exp
  newparams = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:628: RuntimeWarning: invalid value encountered in true_divide
  newparams = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:629: RuntimeWarning: overflow encountered in exp
  tmp = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:629: RuntimeWarning: invalid value encountered in true_divide
  tmp = ((1-np.exp(-params))/(1+np.exp(-params))).copy()
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:473: HessianInversionWarning: Inverting hessian failed, no bse or cov_params available
  'available', HessianInversionWarning)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:496: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:496: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
C:\Users\Andrius\Anaconda3\lib\site-packages\statsmodels\base\model.py:496: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
In [70]:
print("The best ARIMA model is the ARIMA(%d, %d, %d)" % best_order)
#mean_model = smt.api.ARIMA(r_t, order = best_order).fit(method = 'mle', trend = 'c')
#tsdisplay(mean_model.resid, title = "\ of\ residuals")
#tsdisplay(mean_model.resid**2, title = "\ of\ residuals-squared")
print("The mean of the series is ", r_t.mean())
The best ARIMA model is the ARIMA(0, 0, 0)
The mean of the series is  1.0192672836

As we have seen from the ACF and PACF plot of r_t, there is no need to create an ARMA model for the mean model and we got the same results from our ARMA order selection code.

While the model residuals appear to be $WN$, the squared residuals exhibit autocorrelation, so we need to fit a GARCH model on the series. Note, however, that the mean of the series is not zero, so we need to fit a mean model with a constant:

In [71]:
# First, specify the mean model - possible either 'ConstantMean()' or 'ZeroMean()':
am = arch.univariate.ConstantMean(r_t)
# Second, specify the volatility model:
am.volatility = arch.univariate.GARCH(p = 1, q = 1)
# Finally, specify the residual distribution:
am.distribution = arch.univariate.Normal()
output = am.fit()
Iteration:      1,   Func. Count:      6,   Neg. LLF: 2109.82885006753
Iteration:      2,   Func. Count:     16,   Neg. LLF: 2109.662128949584
Iteration:      3,   Func. Count:     23,   Neg. LLF: 2079.9254071764544
Iteration:      4,   Func. Count:     30,   Neg. LLF: 2074.1544459915485
Iteration:      5,   Func. Count:     37,   Neg. LLF: 2060.8541827726026
Iteration:      6,   Func. Count:     44,   Neg. LLF: 2052.451514595776
Iteration:      7,   Func. Count:     51,   Neg. LLF: 2048.854000342623
Iteration:      8,   Func. Count:     58,   Neg. LLF: 2047.7208398743273
Iteration:      9,   Func. Count:     64,   Neg. LLF: 2047.4171635170721
Iteration:     10,   Func. Count:     70,   Neg. LLF: 2047.3865316526096
Iteration:     11,   Func. Count:     76,   Neg. LLF: 2047.3831321330633
Iteration:     12,   Func. Count:     82,   Neg. LLF: 2047.3830591941799
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 2047.3830585248247
            Iterations: 12
            Function evaluations: 83
            Gradient evaluations: 12
In [72]:
print(output.summary())
                     Constant Mean - GARCH Model Results                      
==============================================================================
Dep. Variable:                      y   R-squared:                      -0.000
Mean Model:             Constant Mean   Adj. R-squared:                 -0.000
Vol Model:                      GARCH   Log-Likelihood:               -2047.38
Distribution:                  Normal   AIC:                           4102.77
Method:            Maximum Likelihood   BIC:                           4125.17
                                        No. Observations:                 2000
Date:                Mon, Feb 19 2018   Df Residuals:                     1996
Time:                        22:50:27   Df Model:                            4
                               Mean Model                               
========================================================================
                 coef    std err          t      P>|t|  95.0% Conf. Int.
------------------------------------------------------------------------
mu             1.0155  1.247e-02     81.423      0.000 [  0.991,  1.040]
                            Volatility Model                            
========================================================================
                 coef    std err          t      P>|t|  95.0% Conf. Int.
------------------------------------------------------------------------
omega          0.1809  1.725e-02     10.493  9.336e-26 [  0.147,  0.215]
alpha[1]       0.5118  4.372e-02     11.705  1.199e-31 [  0.426,  0.597]
beta[1]        0.1924  3.523e-02      5.462  4.698e-08 [  0.123,  0.261]
========================================================================

Covariance estimator: robust

We can look at the coefficients and see that they are similar to the ones we used to generate the series ($\beta$ coefficient differs the most):

In [73]:
print(output.params)
print(output.aic)
mu          1.015481
omega       0.180947
alpha[1]    0.511791
beta[1]     0.192441
Name: params, dtype: float64
4102.766117049649

The evaluated model is:

$$ \begin{cases} r_t &= 1.015481 + \epsilon_t \\ \epsilon_t &= \sigma_t w_t \\ \sigma_t^2 &= 0.180947 + 0.511791 \epsilon^2_{t-1} + 0.192441 \sigma^2_{t-1} \end{cases} $$

If we needed, we could also estimate an $AR(1)-GARCH(1, 1)$ model:

In [74]:
am_tmp = arch.univariate.ARX(r_t, lags = 1)
am_tmp.volatility = arch.univariate.GARCH(p = 1, q = 1)
am_tmp.distribution = arch.univariate.Normal()
tmp_mdl = am_tmp.fit(disp = 'off')
In [75]:
print(tmp_mdl.params)
print(tmp_mdl.aic)
Const       1.021405
y[1]       -0.006137
omega       0.181230
alpha[1]    0.511749
beta[1]     0.191995
Name: params, dtype: float64
4102.641587502838

In this case the coefficients are very similar to the actual ones, with an additional $\phi_1 = - 0.006137$ coefficient in the mean equation:

$$ \begin{cases} r_t &= 1.021405 -0.006137 r_{t-1} + \epsilon_t \\ \epsilon_t &= \sigma_t w_t \\ \sigma_t^2 &= 0.181230 + 0.511749 \epsilon^2_{t-1} + 0.191995 \sigma^2_{t-1} \end{cases} $$

We can see that the $AIC$ is slightly smaller for the $AR(1)-GARCH(1,1)$ model, compared to the $GARCH(1, 1)$ model, which would indicate that the second model is slightly better in terms of $AIC$ value.

We can also look at the standardized residuals:

In [76]:
std_resid = tmp_mdl.resid[1:] / tmp_mdl.conditional_volatility[1:]
tsdisplay(std_resid, title = "\ of\ standardized\ residuals")
tsdisplay(std_resid**2, title = "\ of\ standardized\ residuals-squared")



Getting some data from the internet

We can use the pandas_datareader package to get some sample data from the web using the Yahoo Finance API. Package Page.

Open the Anaconda Navigator, Go to 'Environments' and click the arrow near the base(root). Then enter the following:

conda install -c anaconda pandas-datareader

In [78]:
import pandas_datareader.data as web

After this, we can download some stock data

In [79]:
start = '2010-01-01'
end   = '2018-01-01'
GOOG = web.DataReader('GOOG', 'yahoo', start=start, end=end)
In [80]:
GOOG.head(10)
Out[80]:
Open High Low Close Adj Close Volume
Date
2009-12-31 310.356445 310.679321 307.986847 307.986847 307.986847 2455400
2010-01-04 311.449310 312.721039 310.103088 311.349976 311.349976 3937800
2010-01-05 311.563568 311.891449 308.761810 309.978882 309.978882 6048500
2010-01-06 310.907837 310.907837 301.220856 302.164703 302.164703 8009000
2010-01-07 302.731018 303.029083 294.410156 295.130463 295.130463 12912000
2010-01-08 294.087250 299.675903 292.651581 299.064880 299.064880 9509900
2010-01-11 300.276978 300.276978 295.100647 298.612823 298.612823 14519600
2010-01-12 296.893982 297.147339 292.100159 293.332153 293.332153 9769600
2010-01-13 286.382355 292.288940 285.095734 291.648102 291.648102 13077600
2010-01-14 290.063416 295.180145 289.521942 293.019196 293.019196 8535300

Then, we can calculate the returns:

In [81]:
returns = np.log(GOOG['Adj Close'] / GOOG['Adj Close'].shift(1)).dropna()
returns.head(10)
#print(GOOG['Adj Close'].head(2))
#print(GOOG['Adj Close'].shift(1).head(3))
#print(GOOG['Adj Close'].shift(-1).head(3))
Out[81]:
Date
2010-01-04    0.010861
2010-01-05   -0.004413
2010-01-06   -0.025532
2010-01-07   -0.023555
2010-01-08    0.013243
2010-01-11   -0.001513
2010-01-12   -0.017842
2010-01-13   -0.005758
2010-01-14    0.004690
2010-01-15   -0.016840
Name: Adj Close, dtype: float64
In [82]:
print(returns.plot())
AxesSubplot(0.125,0.2;0.775x0.68)

We can then use the data for model estimation. Some example commands:

In [83]:
arch_m = arch.arch_model(returns)
arch_fit = arch_m.fit()
Iteration:      1,   Func. Count:      6,   Neg. LLF: -5635.634269888038
Iteration:      2,   Func. Count:     19,   Neg. LLF: -5635.678837896405
Iteration:      3,   Func. Count:     32,   Neg. LLF: -5635.67905318484
Positive directional derivative for linesearch    (Exit mode 8)
            Current function value: -5635.679047994376
            Iterations: 7
            Function evaluations: 32
            Gradient evaluations: 3
C:\Users\Andrius\Anaconda3\lib\site-packages\arch\univariate\base.py:517: ConvergenceWarning: 
The optimizer returned code 8. The message is:
Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

  ConvergenceWarning)
In [84]:
print(arch_fit.summary())
                     Constant Mean - GARCH Model Results                      
==============================================================================
Dep. Variable:              Adj Close   R-squared:                      -0.000
Mean Model:             Constant Mean   Adj. R-squared:                 -0.000
Vol Model:                      GARCH   Log-Likelihood:                5635.68
Distribution:                  Normal   AIC:                          -11263.4
Method:            Maximum Likelihood   BIC:                          -11240.9
                                        No. Observations:                 2013
Date:                Mon, Feb 19 2018   Df Residuals:                     2009
Time:                        22:50:55   Df Model:                            4
                                  Mean Model                                 
=============================================================================
                 coef    std err          t      P>|t|       95.0% Conf. Int.
-----------------------------------------------------------------------------
mu         5.1654e-04  3.479e-04      1.485      0.138 [-1.654e-04,1.198e-03]
                              Volatility Model                              
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
omega      2.2492e-05  1.128e-10  1.994e+05      0.000 [2.249e-05,2.249e-05]
alpha[1]       0.1000  5.475e-02      1.827  6.777e-02  [-7.305e-03,  0.207]
beta[1]        0.8000  4.670e-02     17.132  8.560e-66     [  0.708,  0.892]
============================================================================

Covariance estimator: robust

WARNING: The optimizer did not indicate sucessful convergence. The message was
Positive directional derivative for linesearch. See convergence_flag.
In [85]:
print(arch_fit.plot())
Figure(864x360)