
# Introduction

This technical report describes how the nonlinear factor analysis (NLFA) algorithm can be extended by modelling the dynamics of the factors. The goal is to find factors which not only represent the observations compactly but are also predictable. Only those parts which differ from the NLFA algorithm are explained, and this report should therefore be read together with the description of the NLFA algorithm.

As in NLFA, learning is unsupervised and is based on uncovering regularities in the observations. NLFA can capture static regularities within the observation vectors x(t) but discards any temporal structure present in the sequence of observations. Including a model of the dynamics of the factors results in a nonlinear dynamic factor analysis (NDFA) model, which can capture both the static and the temporal structure of the observations.

The generative model for the observations x(t) is as follows:

$$x(t) = f(s(t)) + n(t) \tag{1}$$

$$s(t) = g(s(t-1)) + m(t) \tag{2}$$

The observations x(t) are assumed to have been generated by the factors s(t) through a nonlinear mapping f. It is reasonable to assume that the model misses some of the factors affecting the observations and that the nonlinear mapping is somewhat inaccurate. The error caused by these imperfections is modelled by i.i.d. Gaussian noise n(t). The nonlinear mapping f is modelled by a multi-layer perceptron (MLP) network.

The model for the dynamics of the factors has almost the same structure as the observation model. The factors s(t) are assumed to have been generated from the factors s(t-1) at the previous time instant through a nonlinear mapping g. This means that the factors can be interpreted as the states of a dynamical system. The noise m(t) of the dynamic model is often called process noise or the innovation process. Similar models have been proposed in [2,7], where the nonlinear mappings are modelled by radial basis functions, and elsewhere with the nonlinearities modelled by MLP networks.
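To make the generative model of equations (1) and (2) concrete, the following sketch draws a sequence of factors and observations from it. It is only an illustration under stated assumptions: the helper names `mlp` and `simulate`, the one-hidden-layer tanh architecture, the random weights and the noise levels are all hypothetical choices and are not taken from the report.

```python
import numpy as np

def mlp(inputs, w1, b1, w2, b2):
    """One-hidden-layer MLP: tanh hidden units, linear output (assumed architecture)."""
    return w2 @ np.tanh(w1 @ inputs + b1) + b2

def random_mlp_params(rng, n_in, n_hidden, n_out):
    """Draw hypothetical random weights; a learned model would estimate these."""
    return (rng.normal(size=(n_hidden, n_in)), rng.normal(size=n_hidden),
            rng.normal(size=(n_out, n_hidden)), rng.normal(size=n_out))

def simulate(T=200, n_factors=2, n_obs=5, n_hidden=8,
             process_std=0.05, obs_std=0.1, seed=0):
    """Sample factors s(t) and observations x(t) from equations (1) and (2)."""
    rng = np.random.default_rng(seed)
    f_params = random_mlp_params(rng, n_factors, n_hidden, n_obs)      # mapping f
    g_params = random_mlp_params(rng, n_factors, n_hidden, n_factors)  # mapping g

    s = np.zeros((T, n_factors))
    x = np.zeros((T, n_obs))
    s[0] = rng.normal(size=n_factors)  # arbitrary initial state (assumption)
    x[0] = mlp(s[0], *f_params) + obs_std * rng.normal(size=n_obs)
    for t in range(1, T):
        # Equation (2): next factors from previous factors plus process noise m(t).
        s[t] = mlp(s[t - 1], *g_params) + process_std * rng.normal(size=n_factors)
        # Equation (1): observation from current factors plus observation noise n(t).
        x[t] = mlp(s[t], *f_params) + obs_std * rng.normal(size=n_obs)
    return s, x
```

Calling `s, x = simulate()` returns arrays of shape `(T, n_factors)` and `(T, n_obs)`. Learning addresses the inverse problem: only `x` is observed, and both the factors and the two mappings have to be estimated.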

As the dynamic model (2) has the same functional form as the observation model (1), similar sets of hyperparameters can be used for both models. Learning could also be achieved with the same algorithm that was used for NLFA. Some minor changes are made, however, to take two facts into account: the dynamic mapping g can usually be expected to be closer to the identity mapping than to the zero mapping, and the model of the dynamics induces posterior correlations between the factors.
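One possible way to express the expectation that g is close to the identity mapping is to write it as the identity plus an MLP correction, so that small weights give s(t) ≈ s(t-1). The sketch below illustrates this idea only; the function name `g_residual`, the layer sizes and the weight scale are assumptions, and the modifications actually used in the report may differ.

```python
import numpy as np

def g_residual(s_prev, w1, b1, w2, b2):
    """Identity plus a one-hidden-layer tanh MLP correction (assumed form of g)."""
    return s_prev + w2 @ np.tanh(w1 @ s_prev + b1) + b2

rng = np.random.default_rng(0)
n_factors, n_hidden = 3, 6
scale = 1e-3  # small hypothetical initial weights

w1 = scale * rng.normal(size=(n_hidden, n_factors))
b1 = np.zeros(n_hidden)
w2 = scale * rng.normal(size=(n_factors, n_hidden))
b2 = np.zeros(n_factors)

s_prev = rng.normal(size=n_factors)
s_pred = g_residual(s_prev, w1, b1, w2, b2)

# With small weights the prediction stays close to the previous state,
# whereas a plain MLP with the same weights would predict values near zero.
deviation = np.max(np.abs(s_pred - s_prev))
```

With this form, a network initialised with small weights starts from a slowly changing state sequence instead of one that collapses towards zero at every step.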

Section 2 discusses the properties of the state-space model used in this report and some of its alternatives. Section 3 introduces the modifications to the learning algorithm. Results of simulations are reported in Section 4.
Harri Valpola
2000-10-17