The nonlinear factor analysis described in Section can, in theory, model any kind of probability density in the data space. If the hidden units, or computational nodes, of this MLP-like network are replaced by latent variables, one obtains a hierarchical representation in which the number of layers is not restricted. Learning, that is, the adjustment of the parameters, can be done layer by layer with linear computational complexity. The mapping between two adjacent layers can be quite simple and therefore easy to learn, yet the total mapping through all the layers can be strongly nonlinear.
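A minimal sketch of the generative direction of such a hierarchy is given below; the layer sizes, the tanh nonlinearity and the random parameters are illustrative assumptions, not the model described in this thesis. Each layer applies a simple mapping to the latent variables of the layer above it, and the composition through all layers is strongly nonlinear even though every single layer is easy to handle on its own.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_mapping(s, W, b):
    """One layer: a simple affine map followed by an elementwise
    nonlinearity. Each such mapping is simple on its own."""
    return np.tanh(s @ W.T + b)

# Hypothetical layer sizes, from the top latent layer down to the data space.
sizes = [3, 5, 8, 10]
params = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

def generate(s_top):
    """Map top-level latent variables through all layers; the composition
    of the simple per-layer mappings is strongly nonlinear."""
    s = s_top
    for W, b in params:
        s = layer_mapping(s, W, b)
    return s

# 100 samples of the top-level latent variables mapped into the data space.
x = generate(rng.standard_normal((100, sizes[0])))
```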
The visual cortex of mammals seems to have a hierarchical structure [27], which can perhaps be modelled with a similar hierarchical structure. Simple cells in the primary visual cortex (V1) have rectangular antagonistic on/off zones and respond to bars of a particular orientation. Complex cells respond to an edge, a bar or a slit stimulus of a particular orientation falling anywhere within their receptive field; the exact location of the stimulus within the receptive field is not critical. Cells of a third category are called hypercomplex.
Dependencies between the variances of certain features have been found in image data [67]. This could be taken into account in the model by using variance neurons. Several higher-order statistical properties of natural images and signals can be explained by a stochastic model which simply varies the scale of an otherwise stationary Gaussian process [55]. The independent components of images resemble the features that simple cells respond to. In independent subspace analysis, phase-shift-invariant features emerge [29], which corresponds to the behaviour of complex cells; the difference from ICA is that the variances of the features are correlated. Topographic ICA [30] has also given good results on image data, and it too is based on modelling the correlations between the variances of 'independent' components.
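The effect of varying the scale of a stationary Gaussian process can be illustrated with the short sketch below; the smoothed log-scale used here is an arbitrary choice for illustration, not the particular model of [55]. White Gaussian noise multiplied by a slowly varying scale acquires heavy tails (positive excess kurtosis) and correlated variances, even though the samples themselves remain uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Otherwise stationary Gaussian process: unit-variance white noise.
g = rng.standard_normal(n)

# Slowly varying scale: a smoothed log-scale signal, normalised to unit
# variance and exponentiated so that the scale stays positive.
u = np.convolve(rng.standard_normal(n), np.ones(200) / 200, mode="same")
scale = np.exp(u / u.std())
x = scale * g

# Higher-order statistics: excess kurtosis is clearly positive, and the
# variances (squared values) of nearby samples are correlated although
# the samples themselves are uncorrelated.
kurtosis = np.mean(x**4) / np.mean(x**2) ** 2 - 3
lagged_var_corr = np.corrcoef(x[:-10] ** 2, x[10:] ** 2)[0, 1]
print(kurtosis, lagged_var_corr)
```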