
Aim of the Thesis

The aim of this thesis is to build a system that can model high-dimensional static data such as image patches. The resulting model could then be used for noise reduction, for reconstructing missing values (and thus for supervised learning tasks [57]), for predicting future observations, or as a part of an autonomous intelligent system. These practical applications are briefly discussed in Chapter [*].

Generative models are a promising approach to unsupervised learning tasks. This thesis uses a factor-analysis-like latent variable model called hierarchical nonlinear factor analysis with variance modelling (HNFA+VM), which belongs to the class of generative models. Since the structure and complexity of the model are highly adjustable, a cost function is needed that can be used for learning the model structure and for balancing between over- and underfitting. Since the dimensionality of the data is high, computational complexity is also a very important issue.

Ensemble learning [25] provides such a cost function. It can be applied so that the computational complexity scales linearly with the size of the system. Ensemble learning and related variational methods have been successfully applied to various extensions of linear Gaussian factor analysis. These extensions include mixture-of-Gaussians distributions for the source signals [2,10,49], nonlinear units [15,50], and MLP networks that model nonlinear observation mappings [43] and nonlinear dynamics of the sources [63]. Ensemble learning has also been applied to large discrete models such as belief networks [51] and hidden Markov models [48].
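As a brief preview of the idea (the precise notation is introduced later in the thesis; the symbols $\mathbf{X}$ for the observations and $\boldsymbol{\theta}$ for all unknown variables are used here only for illustration), the ensemble-learning cost function measures the misfit between an approximating distribution $q(\boldsymbol{\theta})$ and the joint density of the model:
\[
  \mathcal{C}(q)
  = \int q(\boldsymbol{\theta}) \ln \frac{q(\boldsymbol{\theta})}{p(\mathbf{X}, \boldsymbol{\theta})} \, d\boldsymbol{\theta}
  = D_{\mathrm{KL}}\!\bigl( q(\boldsymbol{\theta}) \,\|\, p(\boldsymbol{\theta} \mid \mathbf{X}) \bigr) - \ln p(\mathbf{X}) .
\]
Minimising $\mathcal{C}(q)$ thus simultaneously fits the approximation to the posterior and favours model structures with high evidence $p(\mathbf{X})$, which is why the same cost can serve both for learning the parameters and for comparing model structures.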

In this thesis, a model is presented which is built from addition, multiplication and Gaussian variables, possibly followed by a nonlinearity. Gaussian variables can also model the variance of other Gaussian variables, so the variance, too, can have a hierarchical model. Related model structures have been proposed for instance in [40,8,56,19,30,29], but with those methods it is difficult to learn the structure of the model or to compare different model structures.
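As an illustrative sketch of this construction (the exact parametrisation used in the thesis is defined later; the $\exp(-v)$ form below is only one possible choice), a Gaussian variable $x$ with a mean input $m$ and a variance input $v$ can be written as
\[
  x \sim \mathcal{N}\!\bigl( x;\; m,\; \exp(-v) \bigr),
\]
where both $m$ and $v$ may in turn be sums or products of other Gaussian variables, possibly passed through a nonlinearity. Because $v$ is itself a latent variable with its own parents, the variance obtains a hierarchical model in the same way as the mean.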
