
An important special case of approximating the posterior pdf is
model selection. The posterior pdf is typically multimodal, but often
almost all of the probability mass is located around its largest peak.
This is almost always the case when there is a lot of data compared to
the complexity of the models. In our case, approximating the posterior
pdf with only one peak is usually reasonably accurate.

Notice that, when averaging over models, the posterior density itself
has no special significance; only the probability mass matters. A
broad peak with low density can be more important than a sharp peak
with high density. Over-learning results in high but very narrow
peaks. The Kullback-Leibler information automatically takes the
probability mass into account and is therefore robust against
over-learning.
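
The difference between density and mass can be illustrated with a small numerical sketch. It assumes, purely for illustration, a one-dimensional unnormalised posterior made of two Gaussian-shaped peaks; the heights and widths are hypothetical values and the helper `peak_mass` is not part of the original text. The mass under a Gaussian peak is its height times its width times sqrt(2*pi), so a peak that is ten times higher but a hundred times narrower carries only a tenth of the mass.

```python
import math


def peak_mass(height, width):
    """Mass under an unnormalised Gaussian peak.

    height: density at the mode (hypothetical value)
    width:  standard deviation of the peak (hypothetical value)
    """
    return height * width * math.sqrt(2.0 * math.pi)


# Hypothetical peaks: a sharp, high one (as produced by over-learning)
# and a broad, low one.
sharp = {"height": 10.0, "width": 0.01}
broad = {"height": 1.0, "width": 1.0}

sharp_mass = peak_mass(**sharp)
broad_mass = peak_mass(**broad)
total_mass = sharp_mass + broad_mass

# Fraction of the total probability mass carried by each peak.
sharp_fraction = sharp_mass / total_mass
broad_fraction = broad_mass / total_mass

print(f"sharp peak: height {sharp['height']}, mass fraction {sharp_fraction:.3f}")
print(f"broad peak: height {broad['height']}, mass fraction {broad_fraction:.3f}")
```

Under these assumed numbers the broad peak holds roughly ten times the mass of the sharp one, even though its density at the mode is ten times lower. Choosing a model by peak height alone would pick the over-learned solution.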

*Harri Lappalainen*

*7/10/1998*