The results are not post-processed, so that all the differences produced by the compared models remain visible. Post-processing could exploit language-dependent syntax and point out uncommon phoneme combinations in the raw phoneme sequences. An optional post-processing module in the applied ASR system is based on the Dynamically Expanding Context (DEC) algorithm [Kohonen, 1986a]. Distinguishing long phoneme versions such as /AA/ from their short counterparts is also a frequent source of errors. Here, the distinction is made using phoneme-dependent duration limits learned iteratively during model training. This simple separation does not take any context information into account. In Finnish, mismatches between the written and spoken forms of words are quite exceptional, but these errors, as well as some unmodeled rare phonemes, raise the lowest obtainable error rate.
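The duration-based separation of short and long phonemes described above can be illustrated with a minimal sketch. It assumes that each recognized segment carries a short-form phoneme label and a duration, and that one duration limit per phoneme is already available. The names `Segment`, `DURATION_LIMITS_MS`, `label_length` and `label_sequence`, as well as the limit values, are hypothetical placeholders and are not taken from the applied ASR system. How the limits are learned during training is not shown.

```python
from dataclasses import dataclass

# Hypothetical phoneme-dependent duration limits in milliseconds.
# In the described system the limits are learned during model training;
# the values below are placeholders chosen only for illustration.
DURATION_LIMITS_MS = {"A": 110.0, "I": 100.0, "S": 130.0}


@dataclass
class Segment:
    phoneme: str        # short-form label, e.g. "A"
    duration_ms: float  # duration of the recognized segment


def label_length(segment, limits=DURATION_LIMITS_MS):
    """Return the long label (e.g. "AA") if the segment exceeds the
    duration limit of its phoneme, otherwise the short label.

    No context information is used, only the segment's own duration.
    Phonemes without a limit are always kept short.
    """
    limit = limits.get(segment.phoneme)
    if limit is not None and segment.duration_ms > limit:
        return segment.phoneme * 2
    return segment.phoneme


def label_sequence(segments, limits=DURATION_LIMITS_MS):
    """Apply the duration rule independently to every segment."""
    return [label_length(s, limits) for s in segments]
```

Because the rule looks at each segment in isolation, a segment lengthened by its position in the word or by the speaking rate is labeled long just like a true long phoneme. This is the limitation noted above.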
Although the recognition test settings look similar in all publications included in the thesis, error rates must be compared between the publications with special care. The speech database was revised in 1995, and the data collected after that is mainly used in the experiments, because a broader variety of speakers is now available, ranging from ASR researchers to novices. In some of the experiments the leave-one-out principle for averaging the results was abandoned in order to test the methods with a larger number of speakers. The set of training words was also extended from the previous 311 to 350 by including some more uncommon words for a better balance among the phoneme combinations. Since other, more independent high-quality Finnish databases suitable for similar experiments have unfortunately not been available, the data from 1991, which uses a slightly different sampling rate than the new data, is still used to verify some of the main results.