
Maximisation step

Assuming that the prior distribution of the parameters $ p(\boldsymbol{\theta}\mid\mathcal{H})$ is formed from Dirichlet distributions makes the maximisation step easy. Let $ \rho(j)\in[0,1]$, $ j=1,\dots,k$ be random variables such that $ \sum_{j=1}^k\rho(j)=1$. In our case, the $ \rho(j)$ are the transition probabilities $ p_{cl}$ corresponding to a certain body or the selection probabilities $ \mu$ corresponding to a certain type. A Dirichlet distribution is defined as

$\displaystyle {P}(\rho)=\prod_j\rho(j)^{z(j)-1}/C,$ (10)

where $ z(j)$, the pseudocounts, are parameters controlling the distribution and $ C$ is a normalisation constant, which guarantees that $ \int {P}(\rho)d\rho=1$. Note that a Dirichlet distribution with $ z(j)=1$ for all $ j$ is uniform.
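The density in equation (10) can be evaluated directly; for the Dirichlet distribution the normalisation constant is $ C=\prod_j\Gamma(z(j))/\Gamma(\sum_j z(j))$. A minimal sketch (the function name is ours for illustration):

```python
from math import gamma, prod

def dirichlet_density(rho, z):
    """Dirichlet density of equation (10):
    P(rho) = prod_j rho(j)^(z(j)-1) / C,
    with normalisation C = prod_j Gamma(z(j)) / Gamma(sum_j z(j))."""
    C = prod(gamma(zj) for zj in z) / gamma(sum(z))
    return prod(r ** (zj - 1) for r, zj in zip(rho, z)) / C

# With all pseudocounts z(j) = 1 the density is constant on the simplex,
# i.e. the distribution is uniform:
print(dirichlet_density([0.2, 0.3, 0.5], [1, 1, 1]))  # -> 2.0
print(dirichlet_density([0.6, 0.3, 0.1], [1, 1, 1]))  # -> 2.0
```

The constant value 2.0 for $ k=3$ is $ \Gamma(3)=2$, the inverse volume of the probability simplex.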

The terms of the log likelihood $ \log p(\boldsymbol{X}\mid \boldsymbol{\theta}, \mathcal{H})$ that depend on $ \rho(j)$ can be written in the form $ \sum_j s(j) \log \rho(j)$, where $ s(j)$ is the expected count of how many times $ \rho(j)$ was used. If we hold the counts $ s(j)$ fixed and vary only the probabilities $ \rho(j)$, the maximum of the a posteriori density can be found analytically:

$\displaystyle \rho(i)=\frac{s(i)+z(i)-1}{\sum_j \left[s(j)+z(j)-1\right]}.$ (11)
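The MAP update in equation (11) amounts to adding the pseudocounts (minus one) to the expected counts and renormalising. A hedged sketch (the function name and example counts are ours):

```python
def map_estimate(s, z):
    """MAP estimate of equation (11):
    rho(i) = (s(i) + z(i) - 1) / sum_j (s(j) + z(j) - 1),
    where s are expected usage counts and z the Dirichlet pseudocounts."""
    num = [si + zi - 1 for si, zi in zip(s, z)]
    total = sum(num)
    return [n / total for n in num]

# Expected counts s = (3, 1) under a uniform prior (all z(j) = 1):
print(map_estimate([3.0, 1.0], [1.0, 1.0]))  # -> [0.75, 0.25]
```

Note that under the uniform prior ($ z(j)=1$) the MAP estimate reduces to the maximum-likelihood estimate $ s(i)/\sum_j s(j)$.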

The componentwise Bayes estimate [8]

$\displaystyle \rho(i)=\frac{s(i)+z(i)}{\sum_j \left[s(j)+z(j)\right]}$ (12)

closely resembles the MAP solution (11). The CB estimate is in fact a MAP estimate with a modified Dirichlet prior (each pseudocount $ z(j)$ increased by one), so the same convergence results hold.
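The equivalence between the CB estimate (12) and a MAP estimate (11) with pseudocounts $ z(j)+1$ can be checked numerically; a small self-contained sketch with illustrative counts of our own choosing:

```python
def cb_estimate(s, z):
    """Componentwise Bayes estimate of equation (12):
    rho(i) = (s(i) + z(i)) / sum_j (s(j) + z(j))."""
    num = [si + zi for si, zi in zip(s, z)]
    total = sum(num)
    return [n / total for n in num]

# CB with pseudocounts z equals the MAP formula (11) applied with z + 1:
s, z = [3.0, 1.0], [1.0, 1.0]
z_plus = [zi + 1 for zi in z]
map_num = [si + zi - 1 for si, zi in zip(s, z_plus)]  # equation (11)
map_rho = [n / sum(map_num) for n in map_num]
print(cb_estimate(s, z) == map_rho)  # -> True
```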


Tapani Raiko 2003-07-09