Hierarchical Nonlinear Factor Analysis with Variance Modelling

This chapter describes a model called hierarchical nonlinear factor analysis with variance modelling (HNFA+VM). It is built hierarchically from the building blocks described in Chapter [*]. In addition to finding nonlinear factors, it can model their variance dependencies. HNFA+VM is thus related to, for instance, topographic ICA [30], which has been successfully applied to the analysis of natural images.

The building blocks can be connected together rather freely, subject to the following restrictions (a code sketch checking them follows the list):

1. The resulting network has to be a directed acyclic graph so that the probability distributions are normalisable [51].
2. A nonlinearity must always immediately follow a Gaussian latent variable, since the required expectations can be solved analytically only for a Gaussian input.
3. The output of a multiplication or a nonlinearity cannot propagate to a variance prior, because then the expected exponential could not be evaluated.
4. There may be only one computational path from a latent variable to any other variable. This assumption is used, for example, in ([*]), ([*]), ([*]), ([*]) and ([*]). If there are multiple paths, ensemble learning becomes more complicated [43], and that situation is outside the scope of this thesis.
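As an illustration only, the following Python sketch checks these four restrictions on a candidate network. The node kinds, the edge-list representation and the function validate are hypothetical conventions chosen for this sketch; they are not part of the thesis or its software.

from collections import defaultdict

def validate(kinds, edges, variance_nodes):
    """Check the four restrictions on a block network (illustrative only).

    kinds: dict node -> 'gaussian' | 'nonlinearity' | 'multiplication'
    edges: dict node -> list of nodes its output feeds into
    variance_nodes: nodes whose incoming signal acts as a variance prior
    """
    nodes = list(kinds)
    parents = defaultdict(list)
    for u in nodes:
        for v in edges.get(u, []):
            parents[v].append(u)

    # 1. The network must be a directed acyclic graph.
    state = {}
    def acyclic(u):
        state[u] = 'visiting'
        for v in edges.get(u, []):
            if state.get(v) == 'visiting':
                return False
            if v not in state and not acyclic(v):
                return False
        state[u] = 'done'
        return True
    assert all(acyclic(u) for u in nodes if u not in state), "cycle found"

    # 2. A nonlinearity must get its input directly from a Gaussian variable.
    for v in nodes:
        if kinds[v] == 'nonlinearity':
            assert all(kinds[p] == 'gaussian' for p in parents[v]), \
                "nonlinearity not immediately after a Gaussian variable"

    # 3. Multiplication or nonlinearity outputs must not reach a variance prior.
    def reachable(u, seen=None):
        seen = set() if seen is None else seen
        for v in edges.get(u, []):
            if v not in seen:
                seen.add(v)
                reachable(v, seen)
        return seen
    for u in nodes:
        if kinds[u] in ('multiplication', 'nonlinearity'):
            assert not reachable(u) & set(variance_nodes), \
                "expected exponential cannot be evaluated"

    # 4. At most one computational path between any pair of variables.
    def n_paths(u, v):
        return 1 if u == v else sum(n_paths(w, v) for w in edges.get(u, []))
    for u in nodes:
        for v in nodes:
            assert n_paths(u, v) <= 1, "multiple paths from %s to %s" % (u, v)
    return True

# Example: source -> nonlinearity -> observation, a valid chain.
kinds = {'s': 'gaussian', 'f': 'nonlinearity', 'x': 'gaussian'}
edges = {'s': ['f'], 'f': ['x']}
print(validate(kinds, edges, variance_nodes=set()))   # True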

The model introduced in this chapter does not include a model for dynamics. Each time-dependent variable is connected only to time-dependent variables that share the same time index. This means that dependencies in time are ignored: nothing would change if the time indices were permuted. Modelling time dependencies is an important direction for future work, as will be discussed in Chapter [*].

Section [*] described nonlinear factor analysis (NFA) [43,64]. The basic idea behind HNFA is to replace the deterministic hidden or computational units of the MLP-like network in NFA with stochastic latent variables. The computational complexity of NFA is quadratic with respect to the number of nodes because of the different paths between sources and observations. In HNFA, these dependencies are cut off at the latent variables of the middle layer, resulting in linear computational complexity, which is important for scalability. Different paths are then treated as independent, so the posterior approximation is less accurate, and ignoring the dependencies increases the cost. On the other hand, the noise model of the hidden nodes means that the reconstructions of the data need not be decided precisely at the uppermost layer of HNFA; instead, the upper layers can merely guide the lower ones, which produce the actual reconstructions.
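As a rough illustration of this structure, the following Python sketch samples from a two-layer generative model of this kind. The weight matrices A and B, the tanh nonlinearity, the layer sizes and the noise levels are assumptions made for the sketch, not the parameterisation used in the thesis.

import numpy as np

rng = np.random.default_rng(0)
n_s, n_h, n_x, T = 3, 10, 20, 100    # layer sizes and number of samples (arbitrary)

B = rng.normal(size=(n_h, n_s))      # weights from sources to hidden layer
A = rng.normal(size=(n_x, n_h))      # weights from hidden layer to observations
b = rng.normal(size=(n_h, 1))        # biases
a = rng.normal(size=(n_x, 1))

s = rng.normal(size=(n_s, T))        # Gaussian sources
# In NFA the hidden layer would be deterministic: h = tanh(B s + b).
# In HNFA, h is itself a latent variable with additive Gaussian noise,
# which cuts the posterior dependencies between the layers at h.
h = np.tanh(B @ s + b) + 0.1 * rng.normal(size=(n_h, T))
x = A @ h + a + 0.1 * rng.normal(size=(n_x, T))   # noisy observations

Because h is a latent variable in its own right, the posterior approximation can treat the two layers separately, which is what allows the cost of learning to scale linearly rather than quadratically in the number of nodes.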



 