Antti Honkela
This thesis describes the development of a switching nonlinear state-space model (NSSM) and a learning algorithm for its parameters. The learning algorithm is based on Bayesian ensemble learning, in which the true posterior distribution is approximated with a tractable ensemble. The approximation is fitted to the probability mass of the posterior to avoid overlearning. The implementation is based on an earlier NSSM by Dr. Harri Valpola, which uses multilayer perceptron networks to model the nonlinear functions of the NSSM.
The computational complexity of the NSSM algorithm severely limits the switching model: only one dynamical model can be used. Hence, the hidden Markov model (HMM) that provides the switching is used only to model the prediction errors of the NSSM. This approach is computationally efficient but makes little use of the HMM.
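The structure described above can be illustrated with a minimal generative sketch. It is not the thesis implementation and does not include the ensemble learning algorithm; it only simulates data from a model in which a single dynamical mapping is shared by all HMM states and the HMM state affects only the prediction error of the dynamics. The function names (mlp, random_matrix, simulate), the dimensions and all parameter values are hypothetical, and the choice to let the HMM state select the mean and standard deviation of a Gaussian prediction error is an assumption made for illustration.

```python
import math
import random


def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer perceptron with tanh hidden units."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def random_matrix(rng, rows, cols, scale):
    """Matrix of independent zero-mean Gaussian weights (illustrative only)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def simulate(T=200, state_dim=2, obs_dim=3, hidden=5, seed=0):
    """Draw T samples from a toy switching NSSM; all values are assumptions."""
    rng = random.Random(seed)
    # Single dynamical mapping g and observation mapping f, both MLPs.
    g = (random_matrix(rng, hidden, state_dim, 0.5), [0.0] * hidden,
         random_matrix(rng, state_dim, hidden, 0.5), [0.0] * state_dim)
    f = (random_matrix(rng, hidden, state_dim, 1.0), [0.0] * hidden,
         random_matrix(rng, obs_dim, hidden, 1.0), [0.0] * obs_dim)
    # Two-state HMM: row m holds the transition probabilities out of state m.
    trans = [[0.95, 0.05], [0.10, 0.90]]
    err_mean = [0.0, 0.5]    # prediction-error mean per HMM state
    err_std = [0.05, 0.5]    # prediction-error standard deviation per HMM state
    obs_std = 0.1            # observation noise standard deviation

    s = [0.0] * state_dim    # continuous hidden state
    m = 0                    # discrete HMM state
    hmm_states, observations = [], []
    for _ in range(T):
        # The HMM state evolves independently of the continuous state.
        m = rng.choices(range(len(trans)), weights=trans[m])[0]
        # The shared dynamics predict the next state; the HMM state only
        # determines the distribution of the prediction error.
        predicted = mlp(s, *g)
        s = [p + rng.gauss(err_mean[m], err_std[m]) for p in predicted]
        # Observations are a nonlinear function of the state plus noise.
        x = [o + rng.gauss(0.0, obs_std) for o in mlp(s, *f)]
        hmm_states.append(m)
        observations.append(x)
    return hmm_states, observations
```

Calling simulate(T=50) returns 50 HMM state indices and 50 three-dimensional observation vectors. Because the mapping g is the same in every HMM state, switching changes only the noise that drives the dynamics, which reflects why the approach is cheap but makes limited use of the HMM.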
The algorithm is tested with real-world speech data. The switching NSSM is found to model the data better than other standard models. It is also demonstrated that the algorithm can find a reasonable segmentation into different phonemes when only the correct sequence of phonemes is known in advance.