
Comparison of Estimators

Let us consider the following simple HMM written as a LOHMM:

$\displaystyle c(1) \stackrel{0.5}{\longleftarrow} start. \qquad c(1) \stackrel{0.9:h}{\longleftarrow} c(1). \qquad c(2) \stackrel{0.1:h}{\longleftarrow} c(2).$
$\displaystyle c(2) \stackrel{0.5}{\longleftarrow} start. \qquad c(1) \stackrel{0.1:t}{\longleftarrow} c(1). \qquad c(2) \stackrel{0.9:t}{\longleftarrow} c(2).$

This could be interpreted as someone picking either coin 1 or coin 2 with equal probability and then using that coin to generate a sequence of heads and tails. Coin 1 produces more heads and coin 2 more tails.
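
To make this generative reading concrete, here is a minimal sketch in plain Python (not the LOHMM machinery of the paper; the names P_HEADS and sample_sequence are only illustrative): a coin is chosen uniformly at random at the start, and that same coin is then tossed for the whole sequence.

import random

# Two-coin model from the transitions above: pick a coin uniformly at
# random, then keep tossing that same coin.  P_HEADS[c] is the
# probability that coin c shows heads (0.9 for coin 1, 0.1 for coin 2).
P_HEADS = {1: 0.9, 2: 0.1}

def sample_sequence(length, rng=random):
    """Draw one observation sequence of the given length."""
    coin = rng.choice([1, 2])   # start -> c(1) or c(2), probability 0.5 each
    return "".join("h" if rng.random() < P_HEADS[coin] else "t" for _ in range(length))

random.seed(0)
print([sample_sequence(3) for _ in range(4)])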

Given the sequences $ h,h,h$ and $ t,t,t$ as data, the structure shown above, and a uniform prior over parameters, we now ask what the different estimators would give as parameter values. In fact, all of them give 0.5 for the selection between the coins, so the probabilities $ p_1$ and $ p_2$ with which coins $ 1$ and $ 2$ produce heads are of more interest.
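
The quantity on which the estimators below differ is essentially the joint likelihood of the two training sequences as a function of $ p_1$ and $ p_2$ (with the uniform prior, the posterior is proportional to it). A small helper, written here only for illustration with made-up names, makes the numbers in the following paragraphs easy to verify:

def seq_likelihood(seq, p1, p2):
    """P(seq) under the model: average over the two equally likely coins."""
    def coin_prob(p):   # probability of seq given one coin with heads-probability p
        out = 1.0
        for s in seq:
            out *= p if s == "h" else (1.0 - p)
        return out
    return 0.5 * coin_prob(p1) + 0.5 * coin_prob(p2)

def data_likelihood(p1, p2):
    """Joint likelihood of the two training sequences h,h,h and t,t,t."""
    return seq_likelihood("hhh", p1, p2) * seq_likelihood("ttt", p1, p2)

print(data_likelihood(0.5, 0.5))   # symmetric point: 0.015625
print(data_likelihood(1.0, 0.0))   # one of the corners: 0.25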

There are three fixed points for the maximum a posteriori estimator (Eq. 3), which with the uniform prior coincides with maximum likelihood. The first is a saddle point at $ p_1=0.5$, $ p_2=0.5$, and the other two are the global maxima $ p_1=1.0$, $ p_2=0.0$ and $ p_1=0.0$, $ p_2=1.0$. With random initialisation, the Baum-Welch algorithm ends up in either of the latter two with equal probability, and in the saddle point with probability 0. This estimator thus concludes from the data that one coin produces heads every time and the other only tails. If the estimated model is tested on the sequence $ h,t,h$, it gives a likelihood of exactly 0. From this failure one can conclude that the maximum likelihood estimator does not generalise well to new data when only a limited amount of data is available for learning.
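
For this model the Baum-Welch iteration reduces to ordinary mixture EM, because the hidden coin never changes within a sequence. The following sketch (the helper names are only illustrative) typically converges to one of the two corners from a random start and then assigns likelihood 0 to $ h,t,h$:

import random

SEQS = ["hhh", "ttt"]

def coin_prob(seq, p):
    """Probability of seq given a single coin with heads-probability p."""
    out = 1.0
    for s in seq:
        out *= p if s == "h" else (1.0 - p)
    return out

def em_ml(p1, p2, iters=100):
    """EM for the maximum-likelihood estimate (Baum-Welch specialised to
    this model, where the hidden coin is constant within a sequence)."""
    for _ in range(iters):
        heads = [0.0, 0.0]
        total = [0.0, 0.0]
        for seq in SEQS:
            a = 0.5 * coin_prob(seq, p1)
            b = 0.5 * coin_prob(seq, p2)
            r1 = a / (a + b)                       # E-step: responsibility of coin 1
            n_h = seq.count("h")
            heads[0] += r1 * n_h
            total[0] += r1 * len(seq)
            heads[1] += (1.0 - r1) * n_h
            total[1] += (1.0 - r1) * len(seq)
        p1 = heads[0] / total[0]                   # M-step: plain relative frequencies
        p2 = heads[1] / total[1]
    return p1, p2

random.seed(1)
p1, p2 = em_ml(random.random(), random.random())
print(p1, p2)                                      # one of the corners, e.g. (0.0, 1.0)
print(0.5 * coin_prob("hth", p1) + 0.5 * coin_prob("hth", p2))   # 0.0 for h,t,h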

The Bayes estimator (Eq. 4) is hard to evaluate in general, but in this case one can use symmetry to conclude that it is $ p_1=0.5$, $ p_2=0.5$. Since the estimator is always unique, it cannot decide which coin produces more heads (and which more tails).
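
The symmetry argument can also be checked numerically. The sketch below (an illustration only; the grid resolution is arbitrary) approximates the posterior mean of $ p_1$ on a grid, using the uniform prior so that the posterior is proportional to the likelihood of the two training sequences:

# Grid approximation of E[p1 | hhh, ttt] under the uniform prior.
N = 400
num = 0.0
den = 0.0
for i in range(N):
    p1 = (i + 0.5) / N
    for j in range(N):
        p2 = (j + 0.5) / N
        like = (0.5 * p1**3 + 0.5 * p2**3) * (0.5 * (1 - p1)**3 + 0.5 * (1 - p2)**3)
        num += p1 * like
        den += like
print(num / den)   # approximately 0.5, as the symmetry argument predicts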

The componentwise Bayes estimator (Eq. 5) also has three fixed points. The first is the saddle point at $ p_1=0.5$, $ p_2=0.5$, in analogy with the MAP estimator. The stable points are now at $ p_1=0.789$, $ p_2=0.211$ and $ p_1=0.211$, $ p_2=0.789$. Again, random initialisation decides which one is chosen. The cB estimator thus seems to combine the good properties of the other two: like the Bayes estimator it can operate with a limited amount of data, but it avoids the symmetric solution.
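
Assuming that the componentwise Bayes update amounts to replacing the maximum-likelihood M-step by the posterior mean of each parameter, that is, expected counts plus the pseudocounts of the uniform Beta(1,1) prior, the following sketch (em_cb is a made-up name) reproduces the stable points quoted above:

def coin_prob(seq, p):
    """Probability of seq given a single coin with heads-probability p."""
    out = 1.0
    for s in seq:
        out *= p if s == "h" else (1.0 - p)
    return out

def em_cb(p1, p2, iters=100):
    """Fixed-point iteration with a posterior-mean (componentwise Bayes
    style) M-step: expected counts plus the uniform prior's pseudocounts."""
    for _ in range(iters):
        heads = [0.0, 0.0]
        total = [0.0, 0.0]
        for seq in ("hhh", "ttt"):
            a = 0.5 * coin_prob(seq, p1)
            b = 0.5 * coin_prob(seq, p2)
            r1 = a / (a + b)
            n_h = seq.count("h")
            heads[0] += r1 * n_h
            total[0] += r1 * len(seq)
            heads[1] += (1.0 - r1) * n_h
            total[1] += (1.0 - r1) * len(seq)
        p1 = (heads[0] + 1.0) / (total[0] + 2.0)   # posterior mean with Beta(1,1) prior
        p2 = (heads[1] + 1.0) / (total[1] + 2.0)
    return p1, p2

print(em_cb(0.6, 0.4))   # converges to approximately (0.789, 0.211)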

