## Theory

Generalized Autoregressive Score (GAS) models are a recent class of observation-driven time-series models for non-normal data. Given a conditional observation density $p\left(y_{t}\mid{x_{t}}\right)$, where $y_{t}$ is the observation and $x_{t}$ is a latent time-varying parameter, we assume that $x_{t}$ follows the recursion:

$x_{t} = \mu + \sum^{p}_{i=1}\phi_{i}x_{t-i} + \sum^{q}_{j=1}\alpha_{j}S\left(x_{t-j}\right)\frac{\partial\log p\left(y_{t-j}\mid{x_{t-j}}\right) }{\partial{x_{t-j}}}$
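The recursion above can be sketched in a few lines of plain Python. This is an illustrative sketch, not PyFlux's implementation: the function name `gas_filter`, the `scaled_score` callback, and the start-up rule (the first max(p, q) values are set to `x_init`) are all assumptions made for the example. The usage lines assume a unit-variance Gaussian with mean $x_{t}$, for which the scaled score is simply $y_{t} - x_{t}$.

```python
from typing import Callable, List, Sequence


def gas_filter(
    y: Sequence[float],
    mu: float,
    phi: Sequence[float],
    alpha: Sequence[float],
    scaled_score: Callable[[float, float], float],
    x_init: float = 0.0,
) -> List[float]:
    """Filter the time-varying parameter x_t through a GAS(p, q) recursion.

    scaled_score(y_t, x_t) must return S(x_t) * d log p(y_t | x_t) / d x_t.
    The first max(p, q) values are set to x_init (an assumed start-up rule).
    """
    p, q = len(phi), len(alpha)
    burn_in = max(p, q)
    x: List[float] = []
    s: List[float] = []  # scaled scores, one per observation
    for t, y_t in enumerate(y):
        if t < burn_in:
            x_t = x_init
        else:
            # sum_i phi_i * x_{t-i}
            ar_part = sum(phi[i] * x[t - 1 - i] for i in range(p))
            # sum_j alpha_j * (scaled score at time t-j)
            score_part = sum(alpha[j] * s[t - 1 - j] for j in range(q))
            x_t = mu + ar_part + score_part
        x.append(x_t)
        s.append(scaled_score(y_t, x_t))
    return x


# Usage with an assumed unit-variance Gaussian: scaled score is y_t - x_t.
path = gas_filter(
    [1.0, 2.0, 0.5],
    mu=0.0,
    phi=[0.5],
    alpha=[0.5],
    scaled_score=lambda y_t, x_t: y_t - x_t,
)
print(path)
```

Passing the score as a callback keeps the recursion itself independent of the chosen observation density.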

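Here $S\left(\cdot\right)$ is a scaling function applied to the score. For example, for a Poisson density with intensity $\exp\left(x_{t}\right)$, the score is $y_{t} - \exp\left(x_{t}\right)$; with the scaling $S\left(x_{t-j}\right) = \exp\left(-x_{t-j}\right)$, the inverse of the Fisher information, the time-varying parameter follows: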

$x_{t} = \mu + \sum^{p}_{i=1}\phi_{i}x_{t-i} + \sum^{q}_{j=1}\alpha_{j}\left(\frac{y_{t-j}}{\exp\left(x_{t-j}\right)} - 1\right)$
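As a sanity check on the score term in the Poisson recursion above, the sketch below compares the closed form $y/\exp(x) - 1$ with a numerical derivative of the Poisson log-density multiplied by $\exp(-x)$. It assumes the log link $\lambda = \exp(x)$ implied by the formula; the helper names are made up for this example.

```python
import math


def poisson_logpmf(y: int, x: float) -> float:
    """Log-density of a Poisson with intensity exp(x), evaluated at count y."""
    return y * x - math.exp(x) - math.lgamma(y + 1)


def scaled_score(y: int, x: float) -> float:
    """Closed-form scaled score used in the Poisson GAS recursion."""
    return y / math.exp(x) - 1.0


def numeric_scaled_score(y: int, x: float, eps: float = 1e-6) -> float:
    """exp(-x) times a central-difference derivative of the log-density in x."""
    derivative = (poisson_logpmf(y, x + eps) - poisson_logpmf(y, x - eps)) / (2 * eps)
    return math.exp(-x) * derivative


for y, x in [(0, 0.3), (4, 1.2), (10, 2.0)]:
    gap = abs(scaled_score(y, x) - numeric_scaled_score(y, x))
    print(y, x, gap < 1e-6)
```

The scaled score is zero when the observed count equals the current intensity, positive when the count exceeds it, and bounded below by -1.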

Models of this type can be viewed as approximations to parameter-driven state space models, and are often competitive in predictive performance. See GAS State Space models for a more general class that extends beyond the simple autoregressive form. The simple GAS models considered in this notebook can be viewed as approximations to non-linear ARIMA processes.

## PyFlux

### Types of GAS Model

PyFlux supports many types of distribution for GAS modelling, including:

| Model | PyFlux Class |
|---|---|
| Poisson GAS | GASPoisson() |
| t GAS | GASt() |
| Skew t GAS | GASSkewt() |
| Normal GAS | GASNormal() |
| Laplace GAS | GASLaplace() |
| Exponential GAS | GASExponential() |

Below we demonstrate usage with an example for count data.

### Poisson GAS for Banking Crisis data

The data below record, for each year, whether each country in the sample experienced a banking crisis; summing across countries gives a yearly count of crises.

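import numpy as np
import pyflux as pf
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# `data` is assumed to be a pandas DataFrame loaded beforehand (the loading
# step is not shown here), with a `year` column and one column per country.

# Sum the country columns to get the number of crises in each year
numpy_data = np.sum(data.iloc[:,2:73].values,axis=1)
# np.sum returns NaN for any year with a missing value; set those years to 0
numpy_data[np.isnan(numpy_data)] = 0
financial_crises = pd.DataFrame(numpy_data)
financial_crises.index = data.year
financial_crises.columns = ["Number of banking crises"]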

plt.figure(figsize=(15,5))
plt.plot(financial_crises)
plt.ylabel("Count")
plt.xlabel("Year")
plt.title("Number of banking crises across the world")
plt.show()
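One detail of the data preparation above is worth illustrating: `np.sum` propagates missing values, so a year with a missing entry for any country ends up with a count of 0 once NaNs are replaced. The toy array below is invented for the illustration and is not taken from the dataset. `np.nansum` is shown as one possible alternative that ignores missing entries instead.

```python
import numpy as np

# Toy indicator matrix: rows are years, columns are countries (made-up values)
toy = np.array([
    [1.0, 0.0, 1.0],
    [1.0, np.nan, 1.0],   # one country has no record for this year
    [0.0, 0.0, 0.0],
])

# Approach used in the example: sum, then replace NaN totals with 0
row_sum = np.sum(toy, axis=1)
row_sum[np.isnan(row_sum)] = 0

# Alternative: treat missing entries as 0 while summing
nan_sum = np.nansum(toy, axis=1)

print(row_sum.tolist())
print(nan_sum.tolist())
```

Which treatment is appropriate depends on how missing values arise in the data; the example in this notebook uses the first approach.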

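We will fit a GAS(2,2) model to the data, with the orders chosen arbitrarily, and specify the family as GASPoisson(). Here ar is the number of autoregressive lags and sc is the number of score lags: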

model = pf.GAS(ar=2,sc=2,data=financial_crises,family=pf.GASPoisson())
x = model.fit()
x.summary()

Poisson GAS(2,0,2)
======================================================= =================================================
Dependent Variable: Number of banking crises            Method: MLE
Start Date: 1802                                        Log Likelihood: -473.5316
End Date: 2010                                          AIC: 957.0632
Number of observations: 209                             BIC: 973.7748
=========================================================================================================
Latent Variable                          Estimate   Std Error  z        P>|z|    95% C.I.
======================================== ========== ========== ======== ======== =========================
Constant                                 0.0        0.0144     0.0      1.0      (-0.0282 | 0.0282)
AR(1)                                    0.4144     1.0631     0.3898   0.6967   (-1.6693 | 2.498)
AR(2)                                    0.5383     0.9959     0.5405   0.5889   (-1.4136 | 2.4902)
SC(1)                                    0.2465     0.023      10.7356  0.0      (0.2015 | 0.2916)
SC(2)                                    0.0725     0.2553     0.2841   0.7763   (-0.4278 | 0.5728)
=========================================================================================================
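The information criteria and confidence intervals in the summary can be checked by hand. The sketch below assumes the standard definitions AIC = 2k - 2 log L and BIC = k log(n) - 2 log L with k = 5 estimated latent variables, and a normal-approximation interval of estimate ± 1.96 standard errors. The inputs are the rounded values printed in the table, so agreement is only up to rounding.

```python
import math

log_likelihood = -473.5316   # from the summary
n_obs = 209                  # number of observations
n_params = 5                 # Constant, AR(1), AR(2), SC(1), SC(2)

aic = 2 * n_params - 2 * log_likelihood
bic = n_params * math.log(n_obs) - 2 * log_likelihood

# 95% interval for SC(1) from its rounded estimate and standard error
estimate, std_error = 0.2465, 0.023
ci_lower = estimate - 1.96 * std_error
ci_upper = estimate + 1.96 * std_error

print(round(aic, 4), round(bic, 4))
print(round(ci_lower, 4), round(ci_upper, 4))
```

In this fit, only the interval for SC(1) excludes zero; the intervals for the constant, both AR terms and SC(2) all contain it.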


We can plot the latent variables using plot_z:

model.plot_z(figsize=(15,5))


We can plot the model fit using plot_fit:

model.plot_fit(figsize=(15,10))


For rolling in-sample prediction we can use plot_predict_is, here over the last h=20 observations. The fit_once argument specifies whether to fit the model once and then predict, or to refit the model after each time step (rolling):

model.plot_predict_is(h=20, fit_once=True, figsize=(15,5))
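The difference between fitting once and refitting at every step can be illustrated with a toy model. The sketch below is not PyFlux's implementation: it uses a hypothetical no-intercept AR(1) fitted by least squares, and the function names are made up. With `fit_once=True` the coefficient is estimated once on the data before the hold-out window; with `fit_once=False` it is re-estimated before each one-step-ahead prediction.

```python
from typing import List, Sequence


def fit_ar1(series: Sequence[float]) -> float:
    """Least-squares estimate of phi in y_t = phi * y_{t-1} (no intercept)."""
    numerator = sum(cur * prev for cur, prev in zip(series[1:], series[:-1]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


def predict_in_sample(series: Sequence[float], h: int, fit_once: bool) -> List[float]:
    """One-step-ahead predictions for the last h points of the series."""
    start = len(series) - h
    phi = fit_ar1(series[:start])        # initial fit excludes the hold-out window
    predictions = []
    for t in range(start, len(series)):
        if not fit_once:
            phi = fit_ar1(series[:t])    # refit on everything observed before t
        predictions.append(phi * series[t - 1])
    return predictions


series = [1.0, 2.0, 4.0, 4.0, 4.0]
print(predict_in_sample(series, h=2, fit_once=True))
print(predict_in_sample(series, h=2, fit_once=False))
```

In both modes each prediction conditions on the observations up to the previous time step; only the parameter estimates differ. Refitting is slower but lets the estimates adapt as more data arrive.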


If we want to see model forecasts, we can use plot_predict:

model.plot_predict(h=10, past_values=30, figsize=(15,5))


To return the forecasts as a DataFrame, we use predict:

model.predict(10)

Number of banking crises
2011 9.253661
2012 8.173014
2013 7.910106
2014 7.299097
2015 6.936803
2016 6.504375
2017 6.162003
2018 5.820292
2019 5.521255
2020 5.238532
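The forecasts above decay smoothly, which is what the GAS recursion implies once future score terms are replaced by their expected value of zero: only the constant and AR terms remain. The sketch below checks this against the printed numbers. It is a reading of the output, not a description of PyFlux's internals, and it assumes the log link $\exp(x_{t})$ for the Poisson intensity. It starts from the logs of the 2011 and 2012 forecasts and iterates with the rounded estimates from the summary, so the match is approximate.

```python
import math

# Rounded estimates from the model summary
mu = 0.0
phi = [0.4144, 0.5383]       # AR(1), AR(2)

# Forecasts printed by model.predict(10), 2011 to 2020
published = [
    9.253661, 8.173014, 7.910106, 7.299097, 6.936803,
    6.504375, 6.162003, 5.820292, 5.521255, 5.238532,
]

# Start from the first two forecasts on the log scale
x = [math.log(published[0]), math.log(published[1])]

# Iterate x_t = mu + phi_1 * x_{t-1} + phi_2 * x_{t-2}, with score terms set to zero
for _ in range(len(published) - 2):
    x.append(mu + phi[0] * x[-1] + phi[1] * x[-2])

forecasts = [math.exp(value) for value in x]

# Largest relative gap between the iterated and the published forecasts
max_rel_gap = max(abs(f - p) / p for f, p in zip(forecasts, published))
print(max_rel_gap < 0.01)
```

Because the AR coefficients sum to less than one and the constant is approximately zero, the latent variable shrinks towards zero over the horizon, so the forecast count drifts down towards $\exp(0) = 1$.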