
VARIATIONAL BAYES AND NONLINEAR STATE-SPACE MODELS

Variational Bayesian learning is based on approximating the posterior distribution $ p(\boldsymbol{\theta}\vert \boldsymbol{X})$ with a tractable approximation $ q(\boldsymbol{\theta}\vert \boldsymbol{\xi})$, where $ \boldsymbol{X}$ is the data, $ \boldsymbol{\theta}$ are the unknown variables (including both the parameters of the model and the latent variables), and $ \boldsymbol{\xi}$ are the (variational) parameters of the approximation. The approximation is fitted by maximizing a lower bound on the marginal log-likelihood

$\displaystyle \mathcal{B}(q(\boldsymbol{\theta}\vert \boldsymbol{\xi})) = \left\langle \log \frac{p(\boldsymbol{X}, \boldsymbol{\theta})}{q(\boldsymbol{\theta}\vert \boldsymbol{\xi})} \right\rangle = \log p(\boldsymbol{X}) - D_{\mathrm{KL}}(q(\boldsymbol{\theta}\vert \boldsymbol{\xi}) \,\Vert\, p(\boldsymbol{\theta}\vert \boldsymbol{X})),$ (22)

where $ \langle \cdot \rangle$ denotes expectation over $ q$. Because $ \log p(\boldsymbol{X})$ does not depend on the approximation, maximizing the bound is equivalent to minimizing the Kullback-Leibler divergence $ D_{\mathrm{KL}}(q \Vert p)$ between the approximation $ q$ and the true posterior $ p(\boldsymbol{\theta}\vert \boldsymbol{X})$ (Ghahramani and Beal, 2001).
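
For concreteness, the following Python sketch (not part of the original text; the one-dimensional conjugate Gaussian model and all variable names are assumptions chosen purely for illustration) evaluates the bound of Eq. (22) for a model in which every quantity is tractable, and checks numerically that the bound equals $ \log p(\boldsymbol{X})$ minus the Kullback-Leibler divergence from $ q$ to the exact posterior.

# Minimal numerical check of Eq. (22), assuming a conjugate Gaussian model:
# theta ~ N(mu0, s0^2), one observation x | theta ~ N(theta, sx^2).
# The model and all names here are illustrative assumptions, not from the text.
import numpy as np

mu0, s0, sx = 0.0, 2.0, 1.0   # prior mean, prior std, likelihood std
x = 1.3                        # observed data point

# Approximation q(theta) = N(m, s^2) with freely chosen variational parameters
m, s = 0.5, 0.7

# Lower bound B(q) = <log p(x|theta)>_q - D_KL(q || prior)
exp_loglik = -0.5 * np.log(2 * np.pi * sx**2) - ((x - m)**2 + s**2) / (2 * sx**2)
kl_q_prior = np.log(s0 / s) + (s**2 + (m - mu0)**2) / (2 * s0**2) - 0.5
bound = exp_loglik - kl_q_prior

# Exact marginal likelihood and posterior, available in this conjugate case
log_evidence = (-0.5 * np.log(2 * np.pi * (s0**2 + sx**2))
                - (x - mu0)**2 / (2 * (s0**2 + sx**2)))
sp2 = 1.0 / (1.0 / s0**2 + 1.0 / sx**2)   # posterior variance
mp = sp2 * (mu0 / s0**2 + x / sx**2)      # posterior mean
kl_q_post = np.log(np.sqrt(sp2) / s) + (s**2 + (m - mp)**2) / (2 * sp2) - 0.5

# Eq. (22): the bound equals log p(x) - D_KL(q || posterior)
print(bound, log_evidence - kl_q_post)    # the two numbers agree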



