This chapter describes the learning algorithm for the HNFA+VM model. Its most important part, the updating of the posterior approximations of individual variables, has already been described in Chapter , but as the simple example in Section showed, there are many local minima in which the learning can get stuck. This makes the initialisation, and the tricks for avoiding those minima, important.
The model structure defined in Chapter is so rich that it can be used in various manners. One possibility is that the sources on the uppermost layer define the reconstructions of the data quite exactly, that is, the neurons in the middle layers are used merely as computational units. The second possibility is that an upper layer just activates the middle layer and the actual values are defined in the middle layer. The difference can be measured by separating the terms of the cost function by layer. The first layer, containing the observations, cannot be compared directly, though, since it differs by not including the $C_{s,q}$ term defined in ().
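The comparison can be illustrated with a small sketch. The following code is not from the thesis; it simply assumes that the learning algorithm exposes the per-variable cost terms and the layer each variable belongs to, and groups them by layer. All names are hypothetical.

    from collections import defaultdict

    def cost_per_layer(cost_terms, layer_of):
        """Sum the cost contributions of every variable, grouped by layer.

        cost_terms : dict mapping variable name -> its term of the total cost
        layer_of   : dict mapping variable name -> layer index (0 = observations)
        """
        totals = defaultdict(float)
        for var, c in cost_terms.items():
            totals[layer_of[var]] += c
        return dict(totals)

If most of the cost of describing the data is paid on the middle layer, the upper layer mainly selects which middle-layer neurons are active; if it is paid on the uppermost layer, the middle layer acts as a computational stage.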
The initialisation in the first case should be quite different from that used in the latter. If the reconstructions are already defined on the uppermost layer, it is not sensible to initialise the model layer by layer. Instead, one could use the initialisation from [43]: the uppermost sources are initialised using PCA and the weights randomly. The sources are then kept fixed for some period so that the algorithm can find a meaningful representation of the data. It would be interesting to conduct such experiments with HNFA in the future.
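A hedged sketch of this initialisation, under the assumption that the data matrix has samples in its rows, is given below. The PCA is computed via the SVD of the mean-removed data; the shapes and the scale of the random weights are illustrative only, not the values used in [43].

    import numpy as np

    def init_top_down(X, n_sources, rng=np.random.default_rng(0)):
        # PCA via SVD of the mean-removed data matrix (samples in rows)
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        sources = Xc @ Vt[:n_sources].T    # initial posterior means of the top sources
        weights = 0.01 * rng.standard_normal((X.shape[1], n_sources))
        return sources, weights

During the first sweeps only the weights and the middle-layer neurons would be updated while the sources stay clamped to their PCA values.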
The experiments in this thesis fall into the second category. The most important function of the upper layer is to activate some of the neurons in the middle layer, and the actual reconstructions of the data are defined only in the middle layer. In this case, the weight matrices are initialised with vector quantisation or independent component analysis and kept fixed for some time so that the sources can find meaningful values.
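As a minimal sketch, assuming scikit-learn is available, the weight matrix could be initialised with a vector-quantisation codebook (k-means centroids) or with the mixing matrix of independent component analysis. The direct assignment of the result to an HNFA weight matrix is a simplification; the function and parameter names are hypothetical.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import FastICA

    def init_weights(X, n_hidden, method="vq", seed=0):
        if method == "vq":
            km = KMeans(n_clusters=n_hidden, n_init=10, random_state=seed).fit(X)
            W = km.cluster_centers_.T      # one codebook vector per hidden neuron
        else:  # "ica"
            ica = FastICA(n_components=n_hidden, random_state=seed).fit(X)
            W = ica.mixing_                # columns mix the independent components
        return W                           # shape: (data dimension, n_hidden)

These weights would then be held fixed for some sweeps while the sources adapt, before all parameters are updated jointly.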