

Hidden Markov models

Hidden Markov models (HMMs) (Rabiner and Juang, 1986) are dynamic Bayesian networks with a particular structure. The observed data are assumed to be generated as a function of an unobserved state that evolves stochastically in time. The observations $x(t)$ are conditioned on the state $s(t)$. The discrete state $s(t)$ at time $t$ is a latent variable, and the conditional probability of a state transition, $P(s(t)\mid s(t-1))$, does not depend on $t$. Here we will use a non-standard formulation where the state transition is conditioned on the previous observation as well, that is, $P(s(t)\mid s(t-1),x(t-1))$. It is fairly easy to see that the two approaches are equivalent in representational capacity. Appendix B of Publication VII shows that the two approaches are equivalent in a setting generalised to first-order logic.
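To make the two formulations concrete, the joint distribution of states and observations over $T$ time steps factorises as follows (a sketch, assuming that the observations are emitted as $P(x(t)\mid s(t))$ in both cases and that the observation at time 0 is empty, so that the first transition depends on $s(0)$ only):

\begin{align*}
P(s(0),\ldots,s(T),x(1),\ldots,x(T)) &= P(s(0))\prod_{t=1}^{T} P(s(t)\mid s(t-1))\,P(x(t)\mid s(t)) , \\
P(s(0),\ldots,s(T),x(1),\ldots,x(T)) &= P(s(0))\prod_{t=1}^{T} P(s(t)\mid s(t-1),x(t-1))\,P(x(t)\mid s(t)) ,
\end{align*}

where the former is the standard HMM and the latter is the formulation used here. In the coin example below, the latter form is the natural one: whether a new coin is drawn from the bag depends on whether the previous toss showed tails.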

Figure 3.3 shows an example of an HMM. There are two bent coins in a bag, one giving more heads and the other more tails. A person draws a coin randomly from the bag and tosses it until it lands tails up. At that point it is put back into the bag and one of the coins is drawn and tossed again. At time 0, the system is always in the start state ($s(0)=$start). Afterwards, it is always in either state coin1 or coin2, possibly passing through the auxiliary state bag. The state being hidden (or latent) means that one does not know which coin is being used; only the sequence of heads and tails is observed.
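The generative process of this example can be written out as a short simulation. The following Python sketch is illustrative only: the thesis does not specify how badly the coins are bent or how a coin is chosen from the bag, so the biases (0.8 and 0.2 heads) and the uniform draw are assumptions.

\begin{verbatim}
import random

# Assumed coin biases; the example does not state how badly the coins are bent.
P_HEADS = {'coin1': 0.8, 'coin2': 0.2}   # emission P(x(t) = heads | s(t))

def sample_sequence(n_tosses, rng=random.Random(0)):
    """Sample states and observations from the coin-tossing HMM.

    The transition is conditioned on the previous observation, as in
    P(s(t) | s(t-1), x(t-1)): after tails the coin goes back into the
    bag and a new coin is drawn uniformly (an assumption).
    """
    states, observations = [], []
    state, prev_obs = 'start', 'tails'   # s(0) = start forces a first draw
    for _ in range(n_tosses):
        if prev_obs == 'tails':
            state = rng.choice(['coin1', 'coin2'])   # draw from the bag
        # after heads, keep tossing the same coin
        obs = 'heads' if rng.random() < P_HEADS[state] else 'tails'
        states.append(state)
        observations.append(obs)
        prev_obs = obs
    return states, observations

print(sample_sequence(10))
\end{verbatim}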

Figure 3.3: Graphical representations of a hidden Markov model. Top left: The corresponding Bayesian network. Top right: The transitions are represented with an arrow and annotated with transition probabilities and possible observations. Bottom: The trellis showing all the possible evolutions of states unrolled in time.
\includegraphics[width=0.45\textwidth]{hmm_bn.eps} \includegraphics[width=0.45\textwidth]{coin_hmm_2.eps}



\includegraphics[width=0.8\textwidth]{hmm_trellis.eps}

The instance of the belief propagation algorithm for HMMs is called the forward-backward algorithm. In the example, it would give the probabilities of which coin was drawn from the bag at each stage. It is also possible to learn the parameters from observations, say, to estimate how badly the coins are bent. That can be done using the Baum-Welch algorithm, which is the instance of the EM algorithm for HMMs.
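As an illustration, the Python sketch below runs the forward-backward algorithm on the coin example, computing the posterior probability of each coin at each toss under the same assumed biases as above. It is not code from the thesis, and numerical safeguards such as scaling of the forward variables are omitted.

\begin{verbatim}
import numpy as np

STATES = ['coin1', 'coin2']
P_HEADS = np.array([0.8, 0.2])           # assumed biases, P(heads | coin)

def emission(obs):
    # P(x(t) = obs | s(t)) for both coins
    return P_HEADS if obs == 'heads' else 1.0 - P_HEADS

def transition(prev_obs):
    # P(s(t) | s(t-1), x(t-1)) as a 2x2 matrix, rows indexed by s(t-1)
    if prev_obs == 'tails':
        return np.full((2, 2), 0.5)      # coin returned to the bag, new draw
    return np.eye(2)                     # heads: keep tossing the same coin

def forward_backward(observations):
    # Returns the posteriors P(s(t) | x(1:T)) for each toss.
    T = len(observations)
    alpha = np.zeros((T, 2))
    beta = np.ones((T, 2))
    alpha[0] = 0.5 * emission(observations[0])     # first coin drawn uniformly
    for t in range(1, T):                          # forward pass
        A = transition(observations[t - 1])
        alpha[t] = (alpha[t - 1] @ A) * emission(observations[t])
    for t in range(T - 2, -1, -1):                 # backward pass
        A = transition(observations[t])
        beta[t] = A @ (emission(observations[t + 1]) * beta[t + 1])
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

obs = ['heads', 'heads', 'tails', 'heads', 'tails']
for o, g in zip(obs, forward_backward(obs)):
    print(o, dict(zip(STATES, np.round(g, 3))))
\end{verbatim}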

Hidden Markov models are very popular for analysing sequential data. Application areas include computational biology (Koski, 2001), speech recognition (Rabiner and Juang, 1986), natural language processing (Charniak, 1993), user modelling (Lane, 1999), and robotics (Meila and Jordan, 1996).

There are several extensions of HMMs, such as factorial HMMs by Ghahramani and Jordan (1997), relational Markov models by Anderson et al. (2002), and an extension based on tree automata by Frasconi et al. (2002). Section 6.2 and Publication VII introduce an extension of hidden Markov models to first-order logic, which generalises all of the above.

