Next: Total Derivatives Up: Nonlinear State-Space Models Previous: Inference Methods

Variational Bayesian method

Nonlinear dynamical factor analysis (NDFA) [1] is a variational Bayesian method for learning nonlinear state-space models. The mappings $ \mathbf{f}$ and $ \mathbf{g}$ in Eqs. (1) and (2) are modelled with multilayer perceptron (MLP) networks whose parameters are learned from the data. The parameter vector $ \boldsymbol{\theta}$ includes the network weights, the noise levels, and the hierarchical priors for them. The posterior distribution over the sources $ \mathbf{S}=\left[\mathbf{s}(1),\dots,\mathbf{s}(T)\right]$ and the parameters $ \boldsymbol{\theta}$ is approximated by a Gaussian distribution $ q(\mathbf{S},\boldsymbol{\theta})$ with some further independence assumptions. Both learning and inference are based on minimising a cost function $ {\cal C}_{\mathrm{KL}}$

$\displaystyle {\cal C}_{\mathrm{KL}}= \int_{\boldsymbol{\theta}}\int_{\mathbf{S}} q(\mathbf{S},\boldsymbol{\theta}) \log\frac{q(\mathbf{S},\boldsymbol{\theta})}{p(\mathbf{X},\mathbf{S},\boldsymbol{\theta})}d\mathbf{S} d\boldsymbol{\theta},$ (5)

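where $ p(\mathbf{X},\mathbf{S},\boldsymbol{\theta})$ is the joint probability density over the data $ \mathbf{X}=\left[\mathbf{x}(1),\dots,\mathbf{x}(T)\right]$, the sources $ \mathbf{S}$, and the parameters $ \boldsymbol{\theta}$. The cost function is based on the Kullback-Leibler divergence between the approximation $ q(\mathbf{S},\boldsymbol{\theta})$ and the true posterior $ p(\mathbf{S},\boldsymbol{\theta}\mid\mathbf{X})$. Since $ p(\mathbf{X},\mathbf{S},\boldsymbol{\theta})=p(\mathbf{S},\boldsymbol{\theta}\mid\mathbf{X})\,p(\mathbf{X})$, the cost equals that divergence minus the constant $ \log p(\mathbf{X})$, so minimising $ {\cal C}_{\mathrm{KL}}$ minimises the divergence. The cost can be split into a sum of terms, which makes it possible to study one part of the model at a time. The variational approach is less prone to overfitting than maximum a posteriori estimation and still fast compared to Monte Carlo methods. See [1] for details.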
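
To make the structure of the cost function concrete, the following minimal sketch evaluates it term by term for a toy model that is far simpler than NDFA: a single scalar source with prior $ s \sim N(0,1)$, a single observation $ x \mid s \sim N(s,1)$, and a Gaussian approximation $ q(s)=N(m,v)$. The model, the function names, and the numbers are assumptions made for illustration only and are not taken from [1]. In this toy case the true posterior is $ N(x/2,1/2)$, so the cost should reach its minimum $ -\log p(x)$ exactly there.

```python
import math

def cost_terms(x, m, v):
    """Terms of C_KL for the toy model s ~ N(0,1), x|s ~ N(s,1), q(s) = N(m, v).

    All three expectations over q are available in closed form.
    """
    neg_entropy = -0.5 * (math.log(2 * math.pi * v) + 1.0)             # E_q[log q(s)]
    prior_term = 0.5 * math.log(2 * math.pi) + 0.5 * (m * m + v)       # -E_q[log p(s)]
    lik_term = 0.5 * math.log(2 * math.pi) + 0.5 * ((x - m) ** 2 + v)  # -E_q[log p(x|s)]
    return neg_entropy, prior_term, lik_term

def cost(x, m, v):
    """C_KL is the sum of the separately computed terms."""
    return sum(cost_terms(x, m, v))

def neg_log_evidence(x):
    """-log p(x) for the toy model, whose marginal is x ~ N(0, 2)."""
    return 0.5 * math.log(2 * math.pi * 2.0) + x * x / 4.0

x = 1.3                                # hypothetical observation
at_posterior = cost(x, x / 2.0, 0.5)   # q equal to the true posterior
at_prior = cost(x, 0.0, 1.0)           # q equal to the prior, a worse fit
```

Because each term depends on only one factor of the model, the contribution of the prior, the likelihood, and the entropy of $ q$ can be inspected separately, which is the property the text refers to when it says the cost can be split into terms.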

The variational Bayesian inference algorithm in [1] uses the gradient of the cost function with respect to the states in a heuristic manner. We propose an algorithm that differs from it in three ways. Firstly, the heuristic updates are replaced by a standard conjugate gradient algorithm [11]. Secondly, the linearisation method from [7] is applied. Thirdly, the gradient is replaced by a vector of approximated total derivatives, as described in the following section.
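
As a rough illustration of the first change, the sketch below runs a conjugate gradient iteration (Polak-Ribière form) over a whole state sequence. It is not the NDFA cost or the algorithm of this paper: the cost is an assumed quadratic stand-in for a scalar linear state-space model, the parameters `a`, `q` and `r` are hypothetical, and the exact step length used here is only valid because that toy cost is quadratic. With a nonlinear cost such as $ {\cal C}_{\mathrm{KL}}$ a line search would be needed instead.

```python
def cost_grad(s, x, a=0.9, q=0.1, r=0.5):
    """Gradient of a toy quadratic cost over the state sequence s:
    sum_t (x[t] - s[t])^2 / (2r) + sum_{t>=1} (s[t] - a*s[t-1])^2 / (2q)."""
    T = len(s)
    g = [(s[t] - x[t]) / r for t in range(T)]   # observation terms
    for t in range(1, T):                        # dynamics terms
        e = (s[t] - a * s[t - 1]) / q
        g[t] += e
        g[t - 1] -= a * e
    return g

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(x, iters=50, tol=1e-12):
    """Minimise the toy cost over all states jointly, starting from zeros."""
    s = [0.0] * len(x)
    g = cost_grad(s, x)
    d = [-gi for gi in g]                        # first direction: steepest descent
    for _ in range(iters):
        if dot(g, g) < tol:
            break
        # The gradient is affine in s, so grad(s + d) - grad(s) equals H d exactly.
        g_shift = cost_grad([si + di for si, di in zip(s, d)], x)
        Hd = [gs - gi for gs, gi in zip(g_shift, g)]
        alpha = -dot(g, d) / dot(d, Hd)          # exact minimiser along d
        s = [si + alpha * di for si, di in zip(s, d)]
        g_new = cost_grad(s, x)
        # Polak-Ribiere coefficient, clipped at zero as a restart safeguard.
        diff = [gn - go for gn, go in zip(g_new, g)]
        beta = max(0.0, dot(g_new, diff) / dot(g, g))
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return s

observations = [0.2, 0.5, 0.1, -0.3, 0.4]        # hypothetical data
states = conjugate_gradient(observations)
```

The point of the sketch is that each new search direction combines the current gradient with the previous direction, so all states are updated jointly along directions that do not undo earlier progress, in contrast to heuristic per-state gradient updates.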


Tapani Raiko 2005-12-08