
The experimental setting

All models used the same preprocessed data set as in Section 7.1.2. Each word was preprocessed separately, and all dynamical models were instructed to treat each word individually, i.e. not to make predictions across word boundaries.
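As a minimal sketch of what treating each word individually can mean for a dynamical model, the following Python snippet builds (current, next) observation pairs only within a word, so that no pair spans a word boundary. The function name `within_word_pairs` and the toy data are hypothetical illustrations, not taken from the experiments.

```python
def within_word_pairs(words):
    """Return (current, next) observation pairs that never cross a word boundary."""
    pairs = []
    for frames in words:
        # Pair each frame with its successor inside the same word only, so the
        # last frame of one word is never linked to the first frame of the next.
        pairs.extend(zip(frames[:-1], frames[1:]))
    return pairs


# Toy data: two "words" with 3 and 2 frames; scalars stand in for feature vectors.
words = [[0.1, 0.2, 0.3], [0.9, 1.0]]
pairs = within_word_pairs(words)
```

Here the pair (0.3, 0.9), which would link the end of the first word to the start of the second, is never produced.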

The models used in the comparison were:

The parameters of the HMM priors for the initial distribution $u^{(\pi)}$ and the transition matrix $u^{(A)}$ were all set to one, which corresponds to a flat, noninformative prior. This choice has little effect on the performance of the switching NSSM. The HMM, on the other hand, is very sensitive to the prior.
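To illustrate why setting all prior parameters to one gives a flat prior, the sketch below assumes the priors are Dirichlet distributions, the usual conjugate choice for probability vectors; this passage itself does not name the family, so that is an assumption. The function `dirichlet_log_density` and the example vectors are hypothetical.

```python
import math


def dirichlet_log_density(p, u):
    """Log density of a Dirichlet distribution with parameters u at the point p."""
    log_norm = math.lgamma(sum(u)) - sum(math.lgamma(a) for a in u)
    return log_norm + sum((a - 1.0) * math.log(x) for a, x in zip(u, p))


# Assumed setting: three HMM states, all prior parameters set to one.
u_flat = [1.0, 1.0, 1.0]
d1 = dirichlet_log_density([0.2, 0.3, 0.5], u_flat)
d2 = dirichlet_log_density([0.8, 0.1, 0.1], u_flat)

# For contrast, a non-flat prior that favours the first state.
u_peaked = [5.0, 1.0, 1.0]
d3 = dirichlet_log_density([0.2, 0.3, 0.5], u_peaked)
d4 = dirichlet_log_density([0.8, 0.1, 0.1], u_peaked)
```

With all parameters equal to one, every exponent in the density vanishes, so `d1` and `d2` coincide for any two probability vectors, whereas `d3` and `d4` differ.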

The data used with the plain HMM was additionally decorrelated with principal component analysis (PCA) [27]. Because the prototype Gaussians of the HMM were restricted to be uncorrelated, this improved the performance considerably compared to using the data without decorrelation. The other algorithms can include the same transformation in the output mapping, so it was not necessary to do it by hand. Using the nondecorrelated data has the advantage that it is "human readable", whereas the decorrelated data is much more difficult to interpret.
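The decorrelation step can be sketched as follows, assuming NumPy and a data matrix with one observation per row. This is only an illustration of PCA decorrelation in general: the text does not say whether dimensions were also dropped or rescaled, so the sketch only centres and rotates the data. The function `pca_decorrelate` and the synthetic data are hypothetical.

```python
import numpy as np


def pca_decorrelate(X):
    """Rotate centred data onto the eigenvectors of its sample covariance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]       # largest variance first
    return Xc @ eigvecs[:, order]


# Synthetic, strongly correlated 2-D data (not the data of the experiments).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X = latent @ np.array([[1.0, 0.8], [0.0, 0.6]])

Z = pca_decorrelate(X)
cov_X = np.cov(X, rowvar=False)
cov_Z = np.cov(Z, rowvar=False)
```

After the rotation the off-diagonal entries of `cov_Z` are zero up to rounding error, which matches the assumption of uncorrelated prototype Gaussians, while the columns of `Z` no longer correspond to the original, interpretable features.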


Antti Honkela 2001-05-30