Expectation step

Next: Maximisation step Up: PARAMETER ESTIMATION Previous: PARAMETER ESTIMATION

Expectation step

The $\alpha$ and $\beta$ values computed by the forward-backward algorithm are useful for many purposes. Firstly, the probability of the observation sequence given the LOHMM is simply $p_{seq}=\sum_{s \in S_T}\alpha_T(s)=\beta_0(start)$ . Secondly, the probability of being in a state at time given the observation sequence is $\alpha_t(s)\beta_t(s)/p_{seq}$ . Thirdly, computing the expected counts required for the maximization step, becomes simple.

The forward-backward algorithm is based on dynamic programming. To ease the computations, we find sets in the beginning of learning. The set is defined as the set of states at time that are reachable from start and are not conflicting with the observation sequence $o_1,o_2,\dots,o_T$ . The reachable states can be found in two phases:

Collect: Set $S_0=\{$ start $\}$ . For each $t=0,1,\dots,T$ , using the states $S_{t}$ as sources, find all states $S_{t+1}$ where a transition can be made without conflict with the current observation .
Prune: For each $t=T,T-1,\dots,0$ , remove state from if there are only conflicting transitions to any of the states still in $S_{t+1}$ .

The

state is pruned if and only if the LOHMM could not have produced the sequence.

Probabilities $\alpha$ and $\beta$ are computed for each state in each recursively. The $\alpha_t(s)$ is defined as the probability of the partial observation sequence $o_1,\dots,o_{t-1}$ and state at time given the LOHMM. The $\beta_t(s)$ is the probability of the partial observation sequence $o_t,\dots,o_T$ given state at time and the LOHMM. Set $\alpha_0(start)=1.0$ and $\beta_{T+1}(s)=1.0$ for every $s\in S_{T+1}$ . Recursion formulae are

$\displaystyle \alpha_t(h)=\sum_{cl\in \mathcal{H}}\sum_{b\in S_{t-1}} \alpha_{t-1}(b)p_{cl}\mu\delta(cl,b,h,o_t)$			(6)
$\displaystyle \beta_t(b)=\sum_{cl\in \mathcal{H}}\sum_{h\in S_{t+1}} \beta_{t+1}(h)p_{cl}\mu\delta(cl,b,h,o_t),$			(7)

where $p_{cl}$ is the transition probability and $\mu$ is the selection probability of the instantiation of the new variables in the head and observation. The indicator function $\delta(cl,b,h,o_t)=1$ whenever transition

can take from state

observing

and the transition

has the most specific body for

The expected counts for the maximisation step are computed using $\alpha$ and $\beta$ values. Let $\nu(cl,b,h,t)$ be the probability of going from state at time to state at time using the transition given the LOHMM and the observation sequence:

$\displaystyle \nu(cl,b,h,t)=\frac{\alpha_t(b)p_{cl}\mu\beta_{t+1}(h)\delta(cl,b,h,o_t)}{p_{seq}}.$

(8)

The expected count

of using transition

with the data set $\boldsymbol{X}$ is

$\displaystyle s(cl)=\sum_{seq\in \boldsymbol{X}}\sum_{t=0}^{T_{seq}}\sum_{b\in S_t}\sum_{h\in S_{t+1}}\nu(cl,b,h,t),$

(9)

where $T_{seq}$ is the length of the sequence

. The expected counts of the selection distribution $\mu$ are also obtained by summing terms $\nu$ analogously.

Next: Maximisation step Up: PARAMETER ESTIMATION Previous: PARAMETER ESTIMATION

Tapani Raiko 2003-07-09