Ensemble learning offers another important benefit: comparison of different models is straightforward. Bayes' rule can be applied again to get the probability of a model $H_i$ given the data $X$:

\[
P(H_i \mid X) = \frac{P(X \mid H_i)\, P(H_i)}{P(X)} .
\]
Multiple models can be used as a mixture-of-experts model [24]. The experts can be weighted with their posterior probabilities given by the equation above. If the models have equal prior probabilities $P(H_i)$ and the parameter approximations are equally good, i.e., the Kullback-Leibler misfits $C_{\mathrm{KL}}$ between the approximations and the true posteriors are equal, then each cost $C_i$ differs from $-\ln P(X \mid H_i)$ by the same constant and the weights simplify to

\[
w_i = \frac{e^{-C_i}}{\sum_j e^{-C_j}} .
\]
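With these weights, the mixture averages the experts' predictive distributions; as a sketch of the combination (the notation $x_{\text{new}}$ for a new observation is assumed here, not taken from the surrounding text):

\[
p(x_{\text{new}} \mid X) = \sum_i w_i \, p(x_{\text{new}} \mid X, H_i) .
\]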
In practice, the costs tend to differ on the order of hundreds or thousands, which makes the model with the lowest cost C dominant: a cost difference of only 100 already gives a weight ratio of $e^{-100} \approx 4 \cdot 10^{-44}$. Therefore it is reasonable to concentrate on model selection rather than weighting.
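To illustrate the dominance numerically, a minimal sketch of the weighting in Python (the function name and the cost values are hypothetical, not from the text; subtracting the minimum cost is the usual log-sum-exp trick to avoid underflow):

import numpy as np

def model_weights(costs):
    """Mixture weights w_i = exp(-C_i) / sum_j exp(-C_j).

    Subtracting the minimum cost before exponentiating avoids
    underflow when the costs differ by hundreds or thousands.
    """
    c = np.asarray(costs, dtype=float)
    w = np.exp(-(c - c.min()))  # lowest cost maps to 1 before normalization
    return w / w.sum()

# Hypothetical costs for three models; differences of this size
# are typical in practice.
print(model_weights([1530.0, 1650.0, 2980.0]))
# -> approximately [1.0, 8e-53, 0.0]: the lowest-cost model dominates,
#    so weighting effectively reduces to model selection.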