Theory

Let’s say we want to approximate an integral of the form:

Y = \int\exp\left(f\left(\theta\right)\right)d\theta


If we perform a second-order Taylor approximation of f\left(\theta\right) around its maximum \theta^{*}:

f\left(\theta\right) =  f\left(\theta^{*}\right) + \left(\theta-\theta^{*}\right)^{T}f'\left(\theta^{*}\right) + \frac{1}{2}\left(\theta-\theta^{*}\right)^{T}f''\left(\theta^{*}\right)\left(\theta-\theta^{*}\right)


Since f'\left(\theta^{*}\right)=0, we can write the integral as:

Y \approx \int\exp\left( f\left(\theta^{*}\right) + \frac{1}{2}\left(\theta-\theta^{*}\right)^{T}f''\left(\theta^{*}\right)\left(\theta-\theta^{*}\right) \right)d\theta


Y \approx \exp\left( f\left(\theta^{*}\right)\right)\int\exp\left(\frac{1}{2}\left(\theta-\theta^{*}\right)^{T}f''\left(\theta^{*}\right)\left(\theta-\theta^{*}\right) \right)d\theta


Y \approx \exp\left( f\left(\theta^{*}\right)\right)2\pi^{K/2}\left|-f''\left(\theta^{*}\right)\right|^{-1/2}


Now suppose f\left(\theta\right) = \log p\left(y\mid{\theta}\right) +  \log p\left({\theta}\right). Then Y is the normalizing constant and the modal approximation can be written as:

q\left(\theta \mid y \right) \sim N(\theta^{*},f''\left(\theta^{*}\right)^{-1})


We can find the point estimates and the Inverse Hessian using Penalized Maximum Likelihood estimation.


PyFlux

We will use the same data from the Penalized Maximum Likelihood example:

import numpy as np
import pyflux as pf
from pandas.io.data import DataReader
from datetime import datetime
import matplotlib.pyplot as plt
%matplotlib inline 

fb = DataReader('FB',  'yahoo', datetime(2000,1,1), datetime(2016,3,10))
fb['Logged Adj Close'] = np.log(fb['Adj Close'].values)
plt.figure(figsize=(15,5))
plt.plot(fb.index[1:len(fb.index)],np.diff(fb['Logged Adj Close'].values))
plt.ylabel('Returns')
plt.title('Facebook Returns')
plt.show()
png

To perform a Laplace approximation, we use the ‘Laplace’ fit option. For example, for an ARIMA(1,1,1) model:

model = pf.ARIMA(data=fb,ar=1,ma=1,integ=1)
x = model.fit("Laplace")
x.summary()
ARIMA(1,1,1)                                                                                              
======================================================= =================================================
Dependent Variable: Differenced Open                    Method: Laplace                                   
Start Date: 2012-05-21 00:00:00                         Unnormalized Log Posterior: -1744.307             
End Date: 2016-03-10 00:00:00                           AIC: 3496.61407324                                
Number of observations: 956                             BIC: 3516.0651049                                 
=========================================================================================================
Latent Variable                          Median             Mean               95% Credibility Interval 
======================================== ================== ================== ==========================
Constant                                 0.0349             0.035              (0.0014 | 0.0684)        
AR(1)                                    0.5762             0.576              (0.3314 | 0.8204)        
MA(1)                                    -0.6477            -0.6473            (-0.9009 | -0.3934)      
Sigma                                    1.4983             1.4986             (1.443 | 1.5569)         
=========================================================================================================

The latent variable plots through plot_z simply use the point estimates and curvature from the inverse Hessian to describe the uncertainty:

model.plot_z(indices=[0,1,2])
png