
VARIATIONAL BAYES AND NONLINEAR STATE-SPACE MODELS

Variational Bayesian learning is based on approximating the posterior distribution $ p(\boldsymbol{\theta}\vert \boldsymbol{X})$ with a tractable approximation $ q(\boldsymbol{\theta}\vert \boldsymbol{\xi})$, where $ \boldsymbol{X}$ is the data, $ \boldsymbol{\theta}$ are the unknown variables (including both the parameters of the model and the latent variables), and $ \boldsymbol{\xi}$ are the (variational) parameters of the approximation. The approximation is fitted by maximizing a lower bound on the marginal log-likelihood

$\displaystyle \mathcal{B}(q(\boldsymbol{\theta}\vert \boldsymbol{\xi})) = \left\langle \log \frac{p(\boldsymbol{X}, \boldsymbol{\theta})}{q(\boldsymbol{\theta}\vert \boldsymbol{\xi})} \right\rangle = \log p(\boldsymbol{X}) - D_{\mathrm{KL}}(q(\boldsymbol{\theta}\vert \boldsymbol{\xi}) \,\Vert\, p(\boldsymbol{\theta}\vert \boldsymbol{X})),$ (22)

where $ \langle \cdot \rangle$ denotes expectation over $ q$. Because $ \log p(\boldsymbol{X})$ does not depend on the approximation, maximizing the bound is equivalent to minimizing the Kullback-Leibler divergence $ D_{\mathrm{KL}}(q \Vert p)$ between the approximation $ q$ and the true posterior $ p(\boldsymbol{\theta}\vert \boldsymbol{X})$ (Ghahramani and Beal, 2001).
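
For concreteness, the following Python sketch (not part of the original text; the one-dimensional conjugate Gaussian model and all variable names are assumptions chosen purely for illustration) evaluates the bound of Eq. (22) for a model in which every quantity is tractable, and checks numerically that the bound equals $ \log p(\boldsymbol{X})$ minus the Kullback-Leibler divergence from $ q$ to the exact posterior.

# Minimal numerical check of Eq. (22), assuming a conjugate Gaussian model:
# theta ~ N(mu0, s0^2), one observation x | theta ~ N(theta, sx^2).
# The model and all names here are illustrative assumptions, not from the text.
import numpy as np

mu0, s0, sx = 0.0, 2.0, 1.0   # prior mean, prior std, likelihood std
x = 1.3                        # observed data point

# Approximation q(theta) = N(m, s^2) with freely chosen variational parameters
m, s = 0.5, 0.7

# Lower bound B(q) = <log p(x|theta)>_q - D_KL(q || prior)
exp_loglik = -0.5 * np.log(2 * np.pi * sx**2) - ((x - m)**2 + s**2) / (2 * sx**2)
kl_q_prior = np.log(s0 / s) + (s**2 + (m - mu0)**2) / (2 * s0**2) - 0.5
bound = exp_loglik - kl_q_prior

# Exact marginal likelihood and posterior, available in this conjugate case
log_evidence = (-0.5 * np.log(2 * np.pi * (s0**2 + sx**2))
                - (x - mu0)**2 / (2 * (s0**2 + sx**2)))
sp2 = 1.0 / (1.0 / s0**2 + 1.0 / sx**2)   # posterior variance
mp = sp2 * (mu0 / s0**2 + x / sx**2)      # posterior mean
kl_q_post = np.log(np.sqrt(sp2) / s) + (s**2 + (m - mp)**2) / (2 * sp2) - 0.5

# Eq. (22): the bound equals log p(x) - D_KL(q || posterior)
print(bound, log_evidence - kl_q_post)    # the two numbers agree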



