Segmental LVQ3 training

  As the previous section showed, despite the good average results obtained under careful supervision, corrective tuning may sometimes behave in undesired ways if the adjustment step size is chosen incorrectly, the initialization is insufficient, or the discrimination is excessive. To overcome the possible instability of corrective LVQ2 tuning and to simplify HMM training by making the additional tuning phase unnecessary, segmental LVQ3 is presented here as a substitute for the separate successive maximum likelihood and corrective training phases. The idea is to apply a simple discriminative training method robust enough to accept directly a quick HMM initialization, for example one based on SOM (see Section 3.2.2), so that conventional ML training can be eliminated from the training scheme entirely (see Figure 3). At the same time, this simple discriminative training should produce acceptable error rates in just a few training epochs, without the need for a separate fine-tuning phase.

Briefly, segmental LVQ3 is a rather straightforward combination of segmental K-means and corrective training. Depending on the input, the algorithm operates in one of two modes. For correctly recognized phonemes, the conventional likelihood maximization mode is used; if the recognition fails, the likelihood of the correct phoneme is increased and the likelihood of the incorrect rival is decreased, improving the discrimination ability of the model. The conventional mode improves both the stability of the learning and the robustness against the initial values, because it prevents excessive parameter penalization and directs the discrimination to the cases where it is really necessary.
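To make the two training modes concrete, the following minimal sketch (in Python, which does not appear in the original text) shows how the batch statistics for one training segment might be accumulated. It deliberately reduces each phoneme model to a plain codebook of prototype vectors rather than a full mixture density HMM, and every name in it (accumulate_segment, delta, and so on) is an illustrative assumption, not the notation of Publications 4 and 6.

    import numpy as np

    def accumulate_segment(frames, models, acc, wgt, correct, rival=None,
                           delta=0.2):
        # frames : (T, d) feature vectors of one phoneme segment
        # models : maps a phoneme label to its (K, d) codebook
        # acc/wgt: per-prototype weighted sums and net weights,
        #          collected over the whole batch
        # rival  : best incorrect label, or None if recognition succeeded
        # delta  : relative weight of the discriminative correction
        for x in frames:
            # Maximum-likelihood mode: the nearest prototype of the
            # correct model always attracts the vector, as in the
            # segmental K-means step.
            m = models[correct]
            k = np.argmin(((m - x) ** 2).sum(axis=1))
            acc[correct][k] += x
            wgt[correct][k] += 1.0
            if rival is not None:
                # Discriminative mode: recognition failed, so the nearest
                # prototype of the incorrect rival receives a negative
                # weight, which in the batch mean pushes it away from x
                # (an LVQ-style penalty).
                r = models[rival]
                j = np.argmin(((r - x) ** 2).sum(axis=1))
                acc[rival][j] -= delta * x
                wgt[rival][j] -= delta

When the net weight of a prototype is positive, the resulting batch mean reduces to the ordinary segmental K-means estimate; negative contributions from misrecognized segments shift it away from the confusable regions of the feature space.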

The adjustments are made in batches consisting of the whole training set, as in segmental K-means. This removes the problem of determining a suitable learning rate schedule that properly accounts for the changing segmentation, and it also improves the convergence speed. It is to be expected, however, that the result of batch training may be somewhat more dependent on the initialization than that of stochastic training, because it can more easily get stuck in local minima. To overcome such convergence problems, the so-called Wegstein modification of the parameter adjustments could be applied in the same way as suggested for the batch version of SOM [Kohonen, 1995]. In any case, since segmental LVQ3 has provided good experimental results (see Publications 4 and 6), it appears well suited to segmental training as such.
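Continuing the sketch above, one batch epoch could then be organized roughly as follows. The segments iterable is assumed to yield (frames, correct, rival) triples produced by a separate segmentation and recognition pass, and the relax blending step merely stands in for the Wegstein-style smoothing mentioned above; the exact form of that modification is given in [Kohonen, 1995].

    import numpy as np

    def run_epoch(segments, models, delta=0.2, relax=0.5):
        # Per-prototype weighted sums and net weights for the whole batch.
        acc = {p: np.zeros_like(m) for p, m in models.items()}
        wgt = {p: np.zeros(len(m)) for p, m in models.items()}
        for frames, correct, rival in segments:
            accumulate_segment(frames, models, acc, wgt, correct, rival,
                               delta)
        for p, m in models.items():
            for k in range(len(m)):
                if wgt[p][k] > 0.0:  # skip prototypes with no net support
                    batch_mean = acc[p][k] / wgt[p][k]
                    # Blend the old value with the batch estimate instead
                    # of replacing it outright; a simple stand-in for the
                    # Wegstein modification.
                    m[k] = (1.0 - relax) * m[k] + relax * batch_mean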

The exact adjustment laws of the segmental LVQ3 are presented in Publication 4. Publication 6 describes the same algorithm, but with a slightly revised notation.

