Next: Ensemble Learning Up: Building Blocks for Hierarchical Previous: Building Blocks for Hierarchical

Introduction

We report principles which have been found useful in designing and learning large factor-analysis-like latent variable models. The design is based on a small number of basic building blocks which can be flexibly combined. Three important issues arise within this design: 1) the need for a cost function which can be used for learning the model structure, 2) a learning method which avoids over-fitting and 3) the requirement of roughly linear computational complexity for scalability.

Ensemble learning [1] has proven to satisfy these requirements. Ensemble learning and related variational methods have been successfully applied to various extensions of linear Gaussian factor analysis. The extensions have included mixture-of-Gaussians distributions for source signals [2], nonlinear units [3,4], and MLP networks which model nonlinear observation mappings [5] and nonlinear dynamics of the sources [6]. Ensemble learning has also been applied to large discrete models such as belief networks and hidden Markov models.

In this paper we discuss models which are built from addition and multiplication, Gaussian variables possibly followed by a nonlinearity, as well as discrete variables and switching units. Various model structures proposed in the literature can be built out of these elements, and we also present some new model structures. The new structures utilise Gaussian variables which model the variance of other Gaussian variables, allowing the variance to have a hierarchical or dynamical model. Related model structures have been proposed, for instance, in [7,8,9,10,11,12], but with these methods it is difficult to learn the structure of the model or to compare different model structures.
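To make the idea of one Gaussian variable modelling the variance of another concrete, the following is a minimal generative sketch. It is not the parametrisation used later in the paper: the choice of letting a variable u set the log-variance of x, the unit prior on u, and the function names are all assumptions made for illustration only.

```python
import numpy as np


def sample_variance_model(n_samples, seed=0):
    """Draw (u, x) where the Gaussian variable u controls the variance of x.

    Assumption for illustration: x ~ N(0, exp(u)) with u ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    # u plays the role of a "variance source": a Gaussian variable in its own right
    u = rng.normal(loc=0.0, scale=1.0, size=n_samples)
    # x is Gaussian given u, with standard deviation exp(u / 2)
    x = rng.normal(loc=0.0, scale=np.exp(0.5 * u))
    return u, x


def excess_kurtosis(x):
    """Sample excess kurtosis; approximately zero for Gaussian data."""
    z = x - x.mean()
    return np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0


u, x = sample_variance_model(100_000)
baseline = np.random.default_rng(1).normal(size=100_000)

print("excess kurtosis of x:       ", excess_kurtosis(x))
print("excess kurtosis of baseline:", excess_kurtosis(baseline))
```

Although x is Gaussian when conditioned on u, its marginal distribution is heavy-tailed, so its excess kurtosis comes out clearly positive while the plain Gaussian baseline stays near zero. Giving u its own hierarchical or dynamical model is what lets the variance of x vary in a structured way.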

This paper is organised as follows. Section 2 gives a brief overview of ensemble learning. The building blocks are introduced in Section 3, and various model structures which utilise them are discussed in Section 4. Experiments with a hierarchical nonlinear model for means and variances are reported in Section 5.

Harri Valpola 2001-10-01