
Maximisation step

Assuming that the prior distribution of the parameters $ p(\boldsymbol{\theta}\mid\mathcal{H})$ is formed from Dirichlet distributions makes the maximisation step easy. Let $ \rho(j)\in[0,1]$, $ j=1,\dots,k$ be random variables such that $ \sum_{j=1}^k\rho(j)=1$. In our case, the $ \rho(j)$ are the transition probabilities $ p_{cl}$ corresponding to a certain body or the selection probabilities $ \mu$ corresponding to a certain type. A Dirichlet distribution is defined as

$\displaystyle {P}(\rho)=\prod_j\rho(j)^{z(j)-1}/C,$ (10)

where $ z(j)$, the pseudocounts, are parameters controlling the distribution and $ C$ is a normalisation constant, which guarantees that $ \int {P}(\rho)d\rho=1$. Note that a Dirichlet distribution with $ z(j)=1$ for all $ j$ is uniform.
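The density in equation (10) can be evaluated directly; for the Dirichlet distribution the normalisation constant is $ C=\prod_j\Gamma(z(j))/\Gamma(\sum_j z(j))$. A minimal sketch (the function name is ours for illustration):

```python
from math import gamma, prod

def dirichlet_density(rho, z):
    """Dirichlet density of equation (10):
    P(rho) = prod_j rho(j)^(z(j)-1) / C,
    with normalisation C = prod_j Gamma(z(j)) / Gamma(sum_j z(j))."""
    C = prod(gamma(zj) for zj in z) / gamma(sum(z))
    return prod(r ** (zj - 1) for r, zj in zip(rho, z)) / C

# With all pseudocounts z(j) = 1 the density is constant on the simplex,
# i.e. the distribution is uniform:
print(dirichlet_density([0.2, 0.3, 0.5], [1, 1, 1]))  # -> 2.0
print(dirichlet_density([0.6, 0.3, 0.1], [1, 1, 1]))  # -> 2.0
```

The constant value 2.0 for $ k=3$ is $ \Gamma(3)=2$, the inverse volume of the probability simplex.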

The terms of the log likelihood $ \log p(\boldsymbol{X}\mid \boldsymbol{\theta}, \mathcal{H})$ that depend on $ \rho(j)$ can be written in the form $ \sum_j s(j) \log \rho(j)$, where $ s(j)$ is the expected count of how many times $ \rho(j)$ was used. If we hold the counts $ s(j)$ fixed and vary only the probabilities $ \rho(j)$, the maximum of the a posteriori density can be found analytically:

$\displaystyle \rho(i)=\frac{s(i)+z(i)-1}{\sum_j \left[s(j)+z(j)-1\right]}.$ (11)
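The MAP update in equation (11) amounts to adding the pseudocounts (minus one) to the expected counts and renormalising. A hedged sketch (the function name and example counts are ours):

```python
def map_estimate(s, z):
    """MAP estimate of equation (11):
    rho(i) = (s(i) + z(i) - 1) / sum_j (s(j) + z(j) - 1),
    where s are expected usage counts and z the Dirichlet pseudocounts."""
    num = [si + zi - 1 for si, zi in zip(s, z)]
    total = sum(num)
    return [n / total for n in num]

# Expected counts s = (3, 1) under a uniform prior (all z(j) = 1):
print(map_estimate([3.0, 1.0], [1.0, 1.0]))  # -> [0.75, 0.25]
```

Note that under the uniform prior ($ z(j)=1$) the MAP estimate reduces to the maximum-likelihood estimate $ s(i)/\sum_j s(j)$.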

The componentwise Bayes estimate [8]

$\displaystyle \rho(i)=\frac{s(i)+z(i)}{\sum_j \left[s(j)+z(j)\right]}$ (12)

closely resembles the MAP solution (11). The CB estimate is in fact a MAP estimate with a modified Dirichlet prior (each pseudocount $ z(j)$ increased by one), so the same convergence results hold.
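The equivalence between the CB estimate (12) and a MAP estimate (11) with pseudocounts $ z(j)+1$ can be checked numerically; a small self-contained sketch with illustrative counts of our own choosing:

```python
def cb_estimate(s, z):
    """Componentwise Bayes estimate of equation (12):
    rho(i) = (s(i) + z(i)) / sum_j (s(j) + z(j))."""
    num = [si + zi for si, zi in zip(s, z)]
    total = sum(num)
    return [n / total for n in num]

# CB with pseudocounts z equals the MAP formula (11) applied with z + 1:
s, z = [3.0, 1.0], [1.0, 1.0]
z_plus = [zi + 1 for zi in z]
map_num = [si + zi - 1 for si, zi in zip(s, z_plus)]  # equation (11)
map_rho = [n / sum(map_num) for n in map_num]
print(cb_estimate(s, z) == map_rho)  # -> True
```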


Tapani Raiko 2003-07-09