## Theory

Vovk’s Aggregating Algorithm is an algorithm for prediction with expert advice. Each expert makes a prediction in each period, and a loss function judges the quality of these predictions. Each expert’s loss then decays its weight through an exponentially weighted scheme, so experts that predict well gain relative weight over time.

Suppose we have $N$ experts. We initialize the algorithm with equal weights $w_{i,0} = 1/N$, and update the weights at each period according to:

$$w_{i,t} = w_{i,t-1}\exp\left(-\eta\,\ell_{i,t}\right)$$

where $\eta$ is a learning rate and $\ell_{i,t}$ is the loss of expert $i$'s prediction in period $t$, e.g. a squared loss $\ell_{i,t} = (\hat{y}_{i,t} - y_t)^2$. The weights can be normalized to sum to one to produce the ensemble prediction, which combines each expert’s prediction with its weight:

$$\hat{y}_t = \sum_{i=1}^{N}\bar{w}_{i,t-1}\,\hat{y}_{i,t}, \qquad \bar{w}_{i,t-1} = \frac{w_{i,t-1}}{\sum_{j=1}^{N} w_{j,t-1}}$$
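To make the update concrete, here is a minimal NumPy sketch of the scheme described above, assuming a squared loss. This is an illustration of the update rule, not PyFlux’s implementation, and the helper name `aggregate_weights` is ours:

```
import numpy as np

def aggregate_weights(preds, y, eta=1.0):
    """Exponentiated weight updates for prediction with expert advice.

    preds : (T, N) array of expert predictions over T periods
    y     : (T,) array of realized values
    eta   : learning rate
    Returns the (T+1, N) history of normalized weights.
    """
    T, N = preds.shape
    weights = np.full(N, 1.0 / N)             # equal initial weights
    history = [weights.copy()]
    for t in range(T):
        losses = (preds[t] - y[t]) ** 2       # squared loss per expert
        losses = losses - losses.min()        # common shift; cancels after normalization
        weights = weights * np.exp(-eta * losses)
        weights = weights / weights.sum()     # normalize to sum to one
        history.append(weights.copy())
    return np.array(history)
```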

## PyFlux

First we import dependencies:

```
import numpy as np
import pandas as pd
import pyflux as pf
import matplotlib.pyplot as plt
%matplotlib inline
```

We will estimate some models on US GDP growth data:

```
# Log-difference real GDP levels to get quarterly growth rates,
# skipping the first 149 observations of the sample
growthdata = pd.read_csv('http://www.pyflux.com/notebooks/GDPC1.csv')
USgrowth = pd.DataFrame(np.diff(np.log(growthdata['VALUE']))[149:len(growthdata['VALUE'])])
USgrowth.index = pd.to_datetime(growthdata['DATE'].values[1+149:len(growthdata)])
USgrowth.columns = ['US Real GDP Growth']
plt.figure(figsize=(15,5))
plt.plot(USgrowth)
plt.ylabel('Real GDP Growth')
plt.title('US Real GDP Growth');
```

We will use the following models:

```
model1 = pf.ARIMA(data=USgrowth, ar=4, ma=0)                             # ARIMA(4,0,0)
model2 = pf.ARIMA(data=USgrowth, ar=8, ma=0)                             # ARIMA(8,0,0)
model3 = pf.LLEV(data=USgrowth)                                          # Gaussian local level
model4 = pf.GASLLEV(data=USgrowth, family=pf.GASt())                     # Student-t GAS local level
model5 = pf.GPNARX(data=USgrowth, ar=1, kernel=pf.SquaredExponential())  # GP-NARX, 1 lag
model6 = pf.GPNARX(data=USgrowth, ar=2, kernel=pf.SquaredExponential())  # GP-NARX, 2 lags
```

We set up the **Aggregate** model with a squared loss penalty and add the models:

```
mix = pf.Aggregate(learning_rate=1.0, loss_type='squared')
mix.add_model(model1)
mix.add_model(model2)
mix.add_model(model3)
mix.add_model(model4)
mix.add_model(model5)
mix.add_model(model6)
```

To find a good learning rate we can use **tune_learning_rate**. For this example, we are going to look at our model performance over 40 periods (10 years):

```
mix.tune_learning_rate(40)
mix.learning_rate
```

```
100000.0
```
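Conceptually, tuning the learning rate amounts to a grid search: try candidate rates and keep the one that minimizes the ensemble’s loss over the evaluation window. Here is a sketch of that idea, reusing the `aggregate_weights` helper from the theory section; the candidate grid is made up, and this is not PyFlux’s actual implementation:

```
def tune_eta(preds, y, candidates=(1e-3, 1e-2, 1e-1, 1.0, 10.0, 1e5)):
    # Score each candidate rate by the squared loss of its ensemble
    # prediction, using the weights held *before* observing y[t]
    best_eta, best_loss = None, np.inf
    for eta in candidates:
        history = aggregate_weights(preds, y, eta=eta)
        ensemble = np.sum(history[:-1] * preds, axis=1)
        loss = np.mean((ensemble - y) ** 2)
        if loss < best_loss:
            best_eta, best_loss = eta, loss
    return best_eta
```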

Let’s run the mixture on the last 40 datapoints (quarters) of the data and view the evolution of the weights through **plot_weights**:

```
mix.plot_weights(h=40, figsize=(15,5))
```

The best ensemble loss was achieved with a high learning rate, likely because a subset of the models was harming ensemble performance (a high learning rate quickly drives the weights of poor performers to zero). Before the financial crisis, the Gaussian local level model was a strong performer; following the crisis, the Student-t local level model dominates.
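We can sanity-check this "kills these off" intuition on synthetic data with the `aggregate_weights` sketch from the theory section: with a large learning rate, essentially all of the weight lands on the most accurate expert almost immediately.

```
rng = np.random.default_rng(0)
y = rng.normal(size=200)                              # synthetic target series
preds = np.column_stack([y + rng.normal(scale=s, size=200)
                         for s in (0.1, 0.5, 2.0)])   # three noisy "experts"
print(aggregate_weights(preds, y, eta=100.0)[-1])     # approx. [1, 0, 0]
```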

We can get the summary results for the ensemble and the constituent models with **summary**:

```
mix.summary(h=40)
```

| Model | Squared Loss |
| --- | --- |
| Ensemble | 0.000044 |
| ARIMA(4,0,0) | 0.000046 |
| ARIMA(8,0,0) | 0.000049 |
| LLEV | 0.000044 |
| t GAS LLM | 0.000043 |
| GPNARX(1) | 0.000059 |
| GPNARX(2) | 0.000055 |

We can see that the GP models are not performing well, which is why a high value of the learning rate was preferred. Let’s see what happens if we exclude these models.

```
mix2 = pf.Aggregate(learning_rate=1.0, loss_type='squared')
mix2.add_model(model1)
mix2.add_model(model2)
mix2.add_model(model3)
mix2.add_model(model4)
mix2.tune_learning_rate(h=40)
mix2.learning_rate
```

```
0.001
```

Now an extremely low learning rate is preferred, which suggests an equal-weights solution: as $\eta \to 0$, $\exp(-\eta\,\ell) \to 1$, so the weights barely move from their equal initialization. Let’s plot the weights:

```
mix2.plot_weights(h=40, figsize=(15,5))
```

Let’s summarize again:

```
mix2.summary(h=40)
```

| Model | Squared Loss |
| --- | --- |
| Ensemble | 0.000042 |
| ARIMA(4,0,0) | 0.000046 |
| ARIMA(8,0,0) | 0.000049 |
| LLEV | 0.000044 |
| t GAS LLM | 0.000043 |

As we can see, the ensemble performs better than any individual model. We can view the in-sample ensemble predictions with **predict_is**:

```
mix2.predict_is(h=5)
```

| Date | 0 |
| --- | --- |
| 2014-10-01 | 0.008900 |
| 2015-01-01 | 0.006043 |
| 2015-04-01 | 0.004974 |
| 2015-07-01 | 0.006713 |
| 2015-10-01 | 0.006267 |

We can predict out of sample with the ensemble using **predict**:

```
mix2.predict(h=5)
```

| Date | 0 |
| --- | --- |
| 2016-01-04 | 0.004939 |
| 2016-04-05 | 0.005529 |
| 2016-07-04 | 0.005337 |
| 2016-10-03 | 0.005396 |
| 2017-01-03 | 0.005649 |

If we want the latest weights in an array, we can use the **run** method directly:

```
# run returns the full weight history; take the final row for the
# latest normalized weights
latest_weights, _, _ = mix2.run(h=40)
latest_weights = latest_weights[-1]
print(latest_weights)
```

```
[ 0.24999999 0.24999996 0.25000001 0.25000003]
```
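As expected, the four remaining models end up with essentially equal weights. Combining these weights with each model’s forecasts is then just a weighted sum; for example (the forecast values below are made up for illustration):

```
# Hypothetical h=1 forecasts, one per remaining model
model_forecasts = np.array([0.0050, 0.0052, 0.0048, 0.0051])
ensemble_forecast = np.dot(latest_weights, model_forecasts)  # weighted sum
print(ensemble_forecast)
```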