Selecting actions based on a state-space model instead of the observations directly has many benefits: Firstly, it is more resistant to noise Raiko05IJCNN because it implicitly involves filtering. Secondly, the observations (without history) do not always carry enough information about the system state. Thirdly, when nonlinear dynamics are modelled by a function approximator such as a multilayer perceptron (MLP) network, a state-space model can find a representation of the state that is better suited to the approximation and thus more predictable.
Nonlinear dynamical factor analysis (NDFA) Valpola02NC is a powerful tool for system identification. It is based on a nonlinear state-space model learned in a variational Bayesian setting. NDFA scales only quadratically with the dimensionality of the observation space, so it is also suitable for modelling systems with fairly high dimensionality Valpola02NC.
In our model, the observation (or measurement) vector $\mathbf{x}(t)$ is assumed to have been generated from the hidden state vector $\mathbf{s}(t)$ driven by the control $\mathbf{u}(t)$ by the following generative model:

$$\mathbf{s}(t) = g(\mathbf{s}(t-1), \mathbf{u}(t)) + \mathbf{m}(t) \tag{1}$$
$$\mathbf{x}(t) = f(\mathbf{s}(t)) + \mathbf{n}(t) \tag{2}$$

where $f$ and $g$ are nonlinear mappings and $\mathbf{m}(t)$ and $\mathbf{n}(t)$ are Gaussian noise terms.
Multilayer perceptron (MLP) networks Haykin98 are well suited to modelling both strong and mild nonlinearities. The MLP network models for $f$ and $g$ are

$$f(\mathbf{s}(t)) = \mathbf{B} \tanh\left(\mathbf{A}\mathbf{s}(t) + \mathbf{a}\right) + \mathbf{b} \tag{3}$$
$$g(\mathbf{s}(t-1), \mathbf{u}(t)) = \mathbf{s}(t-1) + \mathbf{D} \tanh\left(\mathbf{C}\begin{bmatrix}\mathbf{s}(t-1) \\ \mathbf{u}(t)\end{bmatrix} + \mathbf{c}\right) + \mathbf{d} \tag{4}$$

where $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$ are weight matrices and $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, and $\mathbf{d}$ are bias vectors.
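As a concrete illustration of equations (1)-(4), the following sketch simulates the generative model forward in time with randomly initialised MLP weights. All dimensions, weight scales, and noise levels are hypothetical choices made for the example; they are not taken from the NDFA package.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: observations, states, controls, hidden units
n_x, n_s, n_u, n_h = 5, 3, 2, 10

# Randomly initialised MLP weights and biases for f (eq. 3) and g (eq. 4)
A = rng.normal(0, 0.5, (n_h, n_s)); a = rng.normal(0, 0.1, n_h)
B = rng.normal(0, 0.5, (n_x, n_h)); b = rng.normal(0, 0.1, n_x)
C = rng.normal(0, 0.5, (n_h, n_s + n_u)); c = rng.normal(0, 0.1, n_h)
D = rng.normal(0, 0.5, (n_s, n_h)); d = rng.normal(0, 0.1, n_s)

def f(s):
    """Observation mapping f(s(t)) = B tanh(A s(t) + a) + b (eq. 3)."""
    return B @ np.tanh(A @ s + a) + b

def g(s_prev, u):
    """Dynamics g(s(t-1), u(t)) = s(t-1) + D tanh(C [s; u] + c) + d (eq. 4)."""
    return s_prev + D @ np.tanh(C @ np.concatenate([s_prev, u]) + c) + d

# Simulate T steps of the generative model (1)-(2) with Gaussian noise
T = 100
s = np.zeros(n_s)
X = np.empty((T, n_x))
for t in range(T):
    u = rng.normal(0, 1, n_u)                  # control input u(t)
    s = g(s, u) + 0.01 * rng.normal(size=n_s)  # process noise m(t)
    X[t] = f(s) + 0.1 * rng.normal(size=n_x)   # observation noise n(t)
```

Note that $g$ models the *change* of the state (the $\mathbf{s}(t-1) + \dots$ term), so the identity dynamics are the starting point and the MLP only has to learn the deviation from them.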
There are infinitely many models that can explain any given data. In Bayesian learning, all the possible explanations are averaged, weighted by their posterior probability. The posterior probability $p(\mathbf{S}, \boldsymbol{\theta} \mid \mathbf{X})$ of the states $\mathbf{S}$ and the parameters $\boldsymbol{\theta}$ after observing the data $\mathbf{X}$ contains all the relevant information about them. Variational Bayesian learning is a way to approximate the posterior density by a parametric distribution $q(\mathbf{S}, \boldsymbol{\theta})$. The misfit is measured by the Kullback-Leibler divergence:

$$C_{\mathrm{KL}} = D\left(q \,\|\, p\right) = \int q(\mathbf{S}, \boldsymbol{\theta}) \ln \frac{q(\mathbf{S}, \boldsymbol{\theta})}{p(\mathbf{S}, \boldsymbol{\theta} \mid \mathbf{X})} \, d\mathbf{S} \, d\boldsymbol{\theta}.$$
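For intuition about the divergence being minimised, the following check (not part of the NDFA package) compares the closed-form KL divergence between two 1-D Gaussians against its Monte Carlo estimate $\mathrm{E}_q[\ln q - \ln p]$; the distributions and sample count are arbitrary example choices.

```python
import numpy as np

def kl_gauss(m_q, v_q, m_p, v_p):
    """Closed-form KL( N(m_q, v_q) || N(m_p, v_p) ) for 1-D Gaussians."""
    return 0.5 * (np.log(v_p / v_q) + (v_q + (m_q - m_p) ** 2) / v_p - 1.0)

rng = np.random.default_rng(1)
m_q, v_q, m_p, v_p = 0.5, 0.8, 0.0, 1.0

# Monte Carlo estimate of the same quantity: average of ln q(w) - ln p(w)
# over samples w drawn from q
w = rng.normal(m_q, np.sqrt(v_q), 200_000)
log_q = -0.5 * (np.log(2 * np.pi * v_q) + (w - m_q) ** 2 / v_q)
log_p = -0.5 * (np.log(2 * np.pi * v_p) + (w - m_p) ** 2 / v_p)
mc = np.mean(log_q - log_p)

exact = kl_gauss(m_q, v_q, m_p, v_p)
assert abs(mc - exact) < 1e-2   # the two estimates agree closely
```

The divergence is zero only when $q$ matches $p$ exactly, which is why it serves as a misfit measure between the approximation and the true posterior.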
The approximation needs to be simple for mathematical tractability and computational efficiency. Variables are assumed to depend on each other in the following way:

$$q(\mathbf{S}, \boldsymbol{\theta}) = q(\boldsymbol{\theta}) \prod_{i} \prod_{t} q\bigl(s_i(t) \mid s_i(t-1)\bigr),$$

that is, the parameters are independent of the states, and each state component forms a Markov chain over time.
Inference (or state estimation) happens by adjusting the values corresponding to the hidden states in $q$ such that the cost function $C_{\mathrm{KL}}$ is minimised. Learning (or system identification) happens by adjusting both the hidden states and the model parameters in $q$, minimising $C_{\mathrm{KL}}$.
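The distinction between inference and learning can be illustrated on a much simpler problem than NDFA itself. The toy below uses a *linear* state-space model and a plain MAP-style cost (the NDFA cost and its update rules are more involved, see Valpola02NC): the inference step adjusts the hidden states by gradient descent with the model fixed, and the learning step additionally re-estimates the dynamics parameter. All variances, learning rates, and names here are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear state-space data: s(t) = a s(t-1) + m(t), x(t) = s(t) + n(t)
a_true, T = 0.9, 200
s_true = np.zeros(T)
for t in range(1, T):
    s_true[t] = a_true * s_true[t - 1] + 0.3 * rng.normal()
x = s_true + 0.5 * rng.normal(size=T)

vx, vs = 0.5 ** 2, 0.3 ** 2  # observation and process noise variances

def cost(s, a):
    """Negative log joint, playing the role of the cost function C."""
    return (np.sum((x - s) ** 2) / (2 * vx)
            + np.sum((s[1:] - a * s[:-1]) ** 2) / (2 * vs))

s, a = x.copy(), 0.0  # initialise states from the data, dynamics from zero
for it in range(500):
    # Inference: gradient step on the hidden states, model held fixed
    grad_s = (s - x) / vx
    r = (s[1:] - a * s[:-1]) / vs
    grad_s[1:] += r
    grad_s[:-1] -= a * r
    s -= 0.02 * grad_s
    # Learning: closed-form re-estimate of the dynamics parameter
    a = np.dot(s[1:], s[:-1]) / np.dot(s[:-1], s[:-1])

assert cost(s, a) < cost(x, 0.0)  # both steps jointly decrease the cost
```

Running only the inference step with the model fixed corresponds to state estimation on new data; alternating both steps corresponds to system identification.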
The same cost function can also be used for determining the model
structure, e.g. the dimensionality of the state space.
The NDFA package contains an
iterative minimisation algorithm for that. A good initialisation and
other measures are essential to avoid getting stuck in a bad local
minimum of the cost function. The standard initialisation for the
learning is based on principal component analysis of the data
augmented with embedding. Details can be found in Valpola02NC.
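The idea behind that initialisation can be sketched as follows: stack a few consecutive observation vectors (time-delay embedding) so that short-term history is available, then project onto the leading principal components to get initial state means. This is only an illustration of the principle under assumed details; the function names `embed` and `pca_init`, the embedding length, and the data are hypothetical, and the NDFA package's actual routine differs in its specifics (see Valpola02NC).

```python
import numpy as np

def embed(X, k):
    """Stack k consecutive observation vectors (time-delay embedding)."""
    T, n = X.shape
    return np.hstack([X[i:T - k + 1 + i] for i in range(k)])

def pca_init(X, n_states, k=3):
    """Initialise hidden state means by PCA of the delay-embedded data."""
    E = embed(X, k)
    E = E - E.mean(axis=0)
    # Principal directions via SVD; project onto the leading components
    U, sing, Vt = np.linalg.svd(E, full_matrices=False)
    return E @ Vt[:n_states].T

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))      # placeholder observations
S0 = pca_init(X, n_states=3)
assert S0.shape == (98, 3)         # k-1 samples are lost to the embedding
```

The embedding step matters because a single observation may not determine the state (the second benefit listed at the start of this section); augmenting with history before PCA gives the initial states access to the dynamics.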