The joint probability of the state sequence $q = (q_1, \ldots, q_T)$ being generated by the HMM $\lambda$ and the observation sequence $O = (o_1, \ldots, o_T)$ being generated by that state sequence is

$$
P(O, q \mid \lambda) = \pi_{q_1} b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t), \qquad (16)
$$

with the initial probabilities $\pi_i$, the transition probabilities $a_{ij}$, and the output densities $b_j(o_t)$ as defined in the previous section.
Since the generating state sequence is unknown, the actual probability of the observation sequence for the model is

$$
P(O \mid \lambda) = \sum_{q} P(O, q \mid \lambda), \qquad (17)
$$

where the sum runs over all possible state sequences.
The probability (17) can be computed efficiently using the forward-backward procedure [Baum, 1972].
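As a concrete illustration, the following minimal sketch computes the probability (17) by the forward recursion. It assumes a discrete-output HMM with initial probabilities `pi`, transition matrix `A`, and output probability table `B`; these names and the discrete-output assumption are illustrative only, and the output density models of the next section would replace the table lookup.

```python
import numpy as np

def forward_prob(pi, A, B, obs):
    """Illustrative forward recursion: returns P(O | lambda)."""
    # alpha_1(i) = pi_i * b_i(o_1)
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        # alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
        alpha = (alpha @ A) * B[:, o]
    # P(O | lambda) = sum_i alpha_T(i)
    return alpha.sum()

# Toy example: 2 states, 2 output symbols
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
print(forward_prob(pi, A, B, [0, 1, 1]))
```

In practice the recursion is scaled or carried out in the log domain, since the raw forward probabilities underflow for long observation sequences.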
Dynamic programming by the Viterbi algorithm [Forney, 1973] is commonly used to decode the most likely state sequence behind the observations by recursively maximizing the joint probability (16).
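A corresponding sketch of the Viterbi recursion, under the same illustrative discrete-output assumptions as above; it works in the log domain and assumes strictly positive probabilities.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Illustrative Viterbi decoding: argmax over q of P(O, q | lambda)."""
    T = len(obs)
    logA = np.log(A)
    # delta_1(i) = log pi_i + log b_i(o_1)
    delta = np.log(pi) + np.log(B[:, obs[0]])
    psi = np.zeros((T, len(pi)), dtype=int)  # best predecessor of each state
    for t in range(1, T):
        scores = delta[:, None] + logA       # delta_{t-1}(i) + log a_ij
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Backtrack from the best final state.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```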
The estimation of the HMM parameters using the maximum likelihood (ML) criterion, i.e. the maximization of $P(O \mid \lambda)$ over $\lambda$, is done using the Baum-Welch algorithm [Baum and Petrie, 1966]. However, a simpler ML training can be obtained by replacing the maximization of $P(O \mid \lambda)$ by the maximization of the likelihood of the most probable state sequence obtained by the Viterbi search.
The optimal model is then

$$
\hat{\lambda} = \arg\max_{\lambda}\, \max_{q} P(O, q \mid \lambda). \qquad (18)
$$
This latter method is called segmental K-means or Viterbi training, and it can be shown [Rabiner et al., 1986] to have the same asymptotic behavior as Baum-Welch training, but with fewer numerical difficulties.
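To make the alternation concrete, the following sketch (reusing the illustrative `viterbi` function above, with the same hypothetical discrete-output table `B`) alternates a Viterbi alignment step with count-based re-estimation, which is the basic structure of segmental K-means; a real implementation would use the output density models of the next section and proper smoothing of the counts.

```python
import numpy as np

def viterbi_training(pi, A, B, sequences, iterations=10):
    """Illustrative segmental K-means: align with Viterbi, re-estimate from counts."""
    N, M = B.shape
    for _ in range(iterations):
        # Small floor counts keep every estimated probability nonzero.
        pi_c = np.full(N, 1e-3)
        A_c = np.full((N, N), 1e-3)
        B_c = np.full((N, M), 1e-3)
        for obs in sequences:
            q = viterbi(pi, A, B, obs)       # most probable state sequence
            pi_c[q[0]] += 1
            for t in range(1, len(obs)):
                A_c[q[t - 1], q[t]] += 1     # transition counts along the path
            for t, o in enumerate(obs):
                B_c[q[t], o] += 1            # output counts per aligned state
        # Normalized counts are the ML estimates given the fixed alignment.
        pi = pi_c / pi_c.sum()
        A = A_c / A_c.sum(axis=1, keepdims=True)
        B = B_c / B_c.sum(axis=1, keepdims=True)
    return pi, A, B
```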