In practice, exact treatment of the posterior pdf of the models is impossible, and the posterior pdf must be approximated. The existing methods for doing this can be roughly divided into stochastic sampling and parametric approximation. Stochastic sampling typically yields better approximations but is also computationally much more expensive. We therefore opt for the computationally efficient parametric approximation, which usually yields satisfactory results.

A standard approach to parametric approximation is Laplace's method. One variant was introduced to the neural networks community by MacKay, who called his method the evidence framework: one first finds a (local) maximum of the posterior pdf and then applies a second-order Taylor series approximation to the logarithm of the posterior pdf. This amounts to applying a Gaussian approximation to the posterior pdf.
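The two steps above can be sketched numerically. The following is a minimal illustration, not the evidence framework itself: the log-posterior is a hypothetical quadratic chosen for the example, the mode is found by simple gradient ascent, and the curvature at the mode gives the variance of the approximating Gaussian.

```python
import numpy as np

def log_posterior(w):
    # Hypothetical unnormalized log-posterior over a scalar parameter w:
    # a Gaussian likelihood term plus a Gaussian prior term (illustrative).
    return -0.5 * (w - 2.0) ** 2 / 0.5 - 0.5 * w ** 2 / 4.0

def laplace_approximation(log_p, w0, steps=100, lr=0.1, h=1e-3):
    """Gaussian (Laplace) approximation: locate a maximum of log_p by
    gradient ascent, then use the negative inverse curvature there as
    the variance of the approximating Gaussian."""
    w = w0
    for _ in range(steps):
        grad = (log_p(w + h) - log_p(w - h)) / (2 * h)  # central difference
        w += lr * grad
    # Second derivative of log_p at the (local) maximum, by finite differences.
    curv = (log_p(w + h) - 2 * log_p(w) + log_p(w - h)) / h ** 2
    return w, -1.0 / curv  # mean and variance of the Gaussian approximation

mean, var = laplace_approximation(log_posterior, w0=0.0)
```

For this quadratic log-posterior the approximation is exact: the mode is 16/9 and the variance is 4/9, matching the analytic maximum and curvature.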

Unfortunately, Laplace's method, too, can suffer from overlearning. Recall that overly complex models can have very high posterior probability densities. Finding the maximum of the posterior pdf therefore focuses the search on overly complex models. In the end, the second-order Taylor series approximation will reveal that the peak is narrow, but by then it is already too late.
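The failure mode described here can be illustrated with a toy comparison (the two peaks below are hypothetical, not drawn from any model in the text): a narrow spike can dominate in posterior *density* while carrying far less probability *mass* than a broad, lower peak, which is exactly why maximizing the posterior pdf can be misleading.

```python
import numpy as np

w = np.linspace(-10.0, 10.0, 200001)
dw = w[1] - w[0]

# Tall but very narrow peak (stands in for an overly complex model) versus
# a lower but broad peak (a simpler model); both are unnormalized densities.
narrow = 5.0 * np.exp(-0.5 * (w / 0.01) ** 2)
broad = 0.3 * np.exp(-0.5 * (w / 1.0) ** 2)

peak_narrow, peak_broad = narrow.max(), broad.max()  # density at the mode
mass_narrow = narrow.sum() * dw                      # approximate integral
mass_broad = broad.sum() * dw
```

Here the narrow peak wins on density at the mode, yet its integrated mass (roughly 0.13) is far smaller than that of the broad peak (roughly 0.75): a search guided by density alone would pick the wrong peak.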