Selecting actions based on a state-space model instead of the observations directly has many benefits: Firstly, it is more resistant to noise \cite{Raiko05IJCNN} because it implicitly involves filtering. Secondly, the observations (without history) do not always carry enough information about the system state. Thirdly, when nonlinear dynamics are modelled by a function approximator such as a multilayer perceptron (MLP) network, a state-space model can find a representation of the state that is better suited to the approximation and thus more predictable.
Nonlinear dynamical factor analysis (NDFA) \cite{Valpola02NC} is a powerful tool for system identification. It is based on a nonlinear state-space model learned in a variational Bayesian setting. NDFA scales only quadratically with the dimensionality of the observation space, so it is also suitable for modelling systems with fairly high dimensionality \cite{Valpola02NC}.
In our model, the observation (or measurement) vector $\mathbf{x}(t)$ is assumed to have been generated from the hidden state vector $\mathbf{s}(t)$, driven by the control $\mathbf{u}(t)$, by the following generative model:
\begin{align}
\mathbf{x}(t) &= \mathbf{f}(\mathbf{s}(t)) + \mathbf{n}(t) \tag{1} \\
\mathbf{s}(t) &= \mathbf{g}(\mathbf{s}(t-1), \mathbf{u}(t)) + \mathbf{m}(t), \tag{2}
\end{align}
where $\mathbf{f}$ and $\mathbf{g}$ are unknown nonlinear mappings and $\mathbf{n}(t)$ and $\mathbf{m}(t)$ are Gaussian noise terms.
Multilayer perceptron (MLP) networks \cite{Haykin98} suit well to modelling both strong and mild nonlinearities. The MLP network models for $\mathbf{f}$ and $\mathbf{g}$ are
\begin{align}
\mathbf{f}(\mathbf{s}(t)) &= \mathbf{B} \tanh\!\left[\mathbf{A}\mathbf{s}(t) + \mathbf{a}\right] + \mathbf{b} \tag{3} \\
\mathbf{g}(\mathbf{s}(t-1), \mathbf{u}(t)) &= \mathbf{s}(t-1) + \mathbf{D} \tanh\!\left[\mathbf{C}\begin{bmatrix}\mathbf{s}(t-1)\\ \mathbf{u}(t)\end{bmatrix} + \mathbf{c}\right] + \mathbf{d}, \tag{4}
\end{align}
where $\tanh$ is applied component-wise, $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ and $\mathbf{D}$ are weight matrices, and $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and $\mathbf{d}$ are bias vectors.
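To make Eqs. (1)-(4) concrete, here is a minimal NumPy sketch that simulates the generative model with randomly drawn MLP weights. All dimensions, the noise level and the weights are illustrative assumptions, not values from the NDFA package:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions for observations x, states s, controls u and the
# MLP hidden layers; none of these values come from the paper.
dim_x, dim_s, dim_u, dim_h = 5, 3, 1, 10

# Hypothetical randomly drawn weights; in NDFA these are learned from data.
A, a = rng.normal(size=(dim_h, dim_s)), rng.normal(size=dim_h)
B, b = rng.normal(size=(dim_x, dim_h)), rng.normal(size=dim_x)
C, c = rng.normal(size=(dim_h, dim_s + dim_u)), rng.normal(size=dim_h)
D, d = rng.normal(size=(dim_s, dim_h)), rng.normal(size=dim_s)

def f(s):
    # Observation mapping, Eq. (3): f(s) = B tanh(A s + a) + b
    return B @ np.tanh(A @ s + a) + b

def g(s, u):
    # Dynamics mapping, Eq. (4): g(s, u) = s + D tanh(C [s; u] + c) + d
    return s + D @ np.tanh(C @ np.concatenate([s, u]) + c) + d

# Simulate the generative model (1)-(2) with Gaussian noise.
T, noise_std = 100, 0.01
s, X = np.zeros(dim_s), []
for t in range(T):
    u = rng.normal(size=dim_u)                           # control u(t)
    s = g(s, u) + noise_std * rng.normal(size=dim_s)     # Eq. (2)
    X.append(f(s) + noise_std * rng.normal(size=dim_x))  # Eq. (1)
X = np.asarray(X)  # T-by-dim_x matrix of observations
```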
There are infinitely many models that can explain any given data. In Bayesian learning, all the possible explanations are averaged, weighted by their posterior probability. The posterior probability $p(\mathbf{S}, \boldsymbol{\theta} \mid \mathbf{X}, \mathbf{U})$ of the states $\mathbf{S}$ and the parameters $\boldsymbol{\theta}$ after observing the data $\mathbf{X}$ and the controls $\mathbf{U}$ contains all the relevant information about them. Variational Bayesian learning is a way to approximate the posterior density by a parametric distribution $q(\mathbf{S}, \boldsymbol{\theta})$. The misfit is measured by the Kullback-Leibler divergence:
\begin{equation}
C_{\mathrm{KL}} = D\left(q \,\Vert\, p\right) = \int q(\mathbf{S}, \boldsymbol{\theta}) \ln \frac{q(\mathbf{S}, \boldsymbol{\theta})}{p(\mathbf{S}, \boldsymbol{\theta} \mid \mathbf{X}, \mathbf{U})} \, d\mathbf{S} \, d\boldsymbol{\theta}. \tag{5}
\end{equation}
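Since $q$ in NDFA is Gaussian, the Kullback-Leibler terms of the cost decompose into closed-form expressions. As a minimal illustration of the misfit measure only (not the actual NDFA cost function), the divergence between two diagonal Gaussian densities can be computed as follows:

```python
import numpy as np

def kl_diag_gaussians(mean_q, var_q, mean_p, var_p):
    # D(q || p) for diagonal Gaussians, summed over dimensions:
    # 0.5 * [ln(var_p/var_q) + (var_q + (mean_q - mean_p)^2) / var_p - 1]
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mean_q - mean_p) ** 2) / var_p - 1.0
    )

mean_q, var_q = np.array([0.0, 1.0]), np.array([1.0, 0.5])

print(kl_diag_gaussians(mean_q, var_q, mean_q, var_q))        # 0.0 when q == p
print(kl_diag_gaussians(mean_q, var_q, mean_q + 1.0, var_q))  # grows with mismatch
```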
The approximation $q$ needs to be simple for mathematical tractability and computational efficiency. The variables are assumed to depend on each other in the following way:
\begin{equation}
q(\mathbf{S}, \boldsymbol{\theta}) = \prod_i q(\theta_i) \prod_{i,t} q(s_i(t) \mid s_i(t-1)), \tag{6}
\end{equation}
where each factor is Gaussian.
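A rough sketch, with hypothetical field names, of what this factorisation means in practice: $q$ stores a mean and a variance for each parameter and each state component, plus one coefficient per state component expressing the assumed linear dependence of $s_i(t)$ on $s_i(t-1)$. Sampling a state trajectory from such a $q$ then proceeds one time step at a time:

```python
import numpy as np

T, dim_s, n_params = 100, 3, 40  # illustrative sizes

# Hypothetical container for the factorised posterior approximation q:
# independent Gaussians for the parameters, and for each state component
# s_i(t) a Gaussian whose mean depends linearly on s_i(t-1).
q = {
    "theta_mean": np.zeros(n_params),
    "theta_var":  np.ones(n_params),
    "s_mean":     np.zeros((T, dim_s)),      # means of s_i(t)
    "s_var":      np.full((T, dim_s), 0.1),  # conditional variances
    "s_dep":      np.zeros((T, dim_s)),      # dependence of s_i(t) on s_i(t-1)
}

def sample_trajectory(q, rng=np.random.default_rng(0)):
    # Draw one state trajectory from q, respecting the Markov dependency.
    T, dim_s = q["s_mean"].shape
    s = np.empty((T, dim_s))
    s[0] = q["s_mean"][0] + np.sqrt(q["s_var"][0]) * rng.normal(size=dim_s)
    for t in range(1, T):
        mean = q["s_mean"][t] + q["s_dep"][t] * (s[t - 1] - q["s_mean"][t - 1])
        s[t] = mean + np.sqrt(q["s_var"][t]) * rng.normal(size=dim_s)
    return s
```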
Inference (or state estimation) happens by adjusting the values corresponding to the hidden states in $q$ such that the cost function $C_{\mathrm{KL}}$ is minimised. Learning (or system identification) happens by adjusting both the hidden states and the model parameters in $q$, minimising $C_{\mathrm{KL}}$. The same cost function can also be used for determining the model structure, e.g. the dimensionality of the state space. The NDFA package contains an iterative minimisation algorithm for this. A good initialisation and other measures are essential to avoid getting stuck in a bad local minimum of the cost function. The standard initialisation for learning is based on principal component analysis of the data augmented with embedding. Details can be found in \cite{Valpola02NC}.
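The NDFA package implements its own update rules, so the following is only a structural sketch with a made-up quadratic cost standing in for $C_{\mathrm{KL}}$. It illustrates the split described above: inference keeps the parameters frozen and adjusts the states, while learning adjusts both:

```python
import numpy as np

def numeric_grad(cost, w, eps=1e-5):
    # Central differences; fine for a toy cost, far too slow for real use.
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (cost(w + step) - cost(w - step)) / (2 * eps)
    return grad

def minimise(cost, w, n_iter=200, lr=0.1):
    # Plain gradient descent standing in for the NDFA update rules.
    for _ in range(n_iter):
        w = w - lr * numeric_grad(cost, w)
    return w

states0, params0 = np.array([2.0, -1.0]), np.array([0.5])

def toy_cost(v):
    # Made-up quadratic stand-in for C_KL over concatenated [states, params].
    states, params = v[:2], v[2:]
    return np.sum((states - params) ** 2) + np.sum(params ** 2)

# Inference (state estimation): parameters frozen, states adjusted.
s_inf = minimise(lambda s: toy_cost(np.concatenate([s, params0])), states0)

# Learning (system identification): states and parameters adjusted jointly.
w_learn = minimise(toy_cost, np.concatenate([states0, params0]))
```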