In Bayesian learning, the learning system updates its prior probability over models (explanations, states of the world, etc.) into a posterior probability according to Bayes' rule. The posterior can then be used for prediction or for making decisions. In both cases, the computation involves a sum weighted by the posterior probability (or an integral in the case of real-valued parameters). Straightforward numerical summation or integration is usually far too expensive computationally, and various techniques have therefore been developed for approximating the result.
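
As a minimal sketch of the update and the posterior-weighted sum, consider a finite set of candidate models. The three coin models, their heads probabilities, the uniform prior, and the observed counts below are illustrative assumptions rather than anything taken from the text; the function names are likewise hypothetical.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite model set: posterior is proportional to likelihood * prior."""
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(unnormalized)  # normalizing constant
    return [u / evidence for u in unnormalized]


def predictive(post, per_model_prediction):
    """Prediction as a sum over models, weighted by posterior probability."""
    return sum(w * q for w, q in zip(post, per_model_prediction))


# Assumed toy setting: three coin models, each fixing a heads probability.
heads_prob = [0.2, 0.5, 0.8]
prior = [1 / 3, 1 / 3, 1 / 3]

# Assumed observations: 3 heads and 1 tail.
heads, tails = 3, 1
likelihood = [q**heads * (1 - q) ** tails for q in heads_prob]

post = posterior(prior, likelihood)
p_next_heads = predictive(post, heads_prob)
```

With only three models the sum is trivial. The cost becomes an issue when the model set is very large or the parameters are continuous, which is the situation the approximation techniques address.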

There are two complementary ways to reduce the required computation. One is to design the models so that the posterior probability has a simple, mathematically tractable functional form. The other is to approximate the weighted sum or integral. The two are complementary because the accuracy of the approximation depends on the complexity of the posterior probability, which in turn is affected by the design of the model.
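
The first route can be illustrated with a conjugate model, where the posterior keeps a simple functional form. The sketch below assumes a Beta prior on the heads probability of a coin (an example chosen here, not one from the text). The posterior is then again a Beta distribution and the predictive integral has a closed form, which is compared against a brute-force midpoint-rule integration. All function names are hypothetical.

```python
def beta_bernoulli_update(a, b, heads, tails):
    """Conjugate update: a Beta(a, b) prior becomes Beta(a + heads, b + tails)."""
    return a + heads, b + tails


def predictive_heads_closed_form(a, b):
    """Posterior predictive probability of heads is the mean of Beta(a, b)."""
    return a / (a + b)


def predictive_heads_grid(a, b, heads, tails, n=10_000):
    """The same quantity by numerical integration (midpoint rule on [0, 1])."""
    numerator = 0.0
    normalizer = 0.0
    for i in range(n):
        theta = (i + 0.5) / n
        # Unnormalized posterior density: Beta prior times Bernoulli likelihood.
        weight = theta ** (a - 1 + heads) * (1 - theta) ** (b - 1 + tails)
        numerator += theta * weight
        normalizer += weight
    return numerator / normalizer


# Assumed example: uniform prior Beta(1, 1), then 3 heads and 1 tail.
a_post, b_post = beta_bernoulli_update(1, 1, heads=3, tails=1)
exact = predictive_heads_closed_form(a_post, b_post)
numeric = predictive_heads_grid(1, 1, heads=3, tails=1)
```

The closed form needs two additions and one division, whereas the grid evaluates the density at thousands of points for a single parameter. With many parameters a grid becomes infeasible, which is why a tractable model design matters.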

If it is known in advance which prediction or decision will be made based on the posterior probability, the approximation can and should take this into account. In many cases, however, this information is not available. The best strategy is then to approximate those parts of the posterior probability that carry the most probability mass, because they have the strongest impact on predictions and decisions.
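
One simple way to concentrate on the high-mass parts, shown here only as an illustrative sketch, is to truncate the weighted sum: keep the most probable models until a chosen fraction of the posterior mass is covered, then renormalize. The posterior values, per-model predictions, and the 0.85 threshold below are made-up numbers, and `truncated_expectation` is a hypothetical helper, not a method named in the text.

```python
def truncated_expectation(post, values, mass=0.85):
    """Approximate sum_i post[i] * values[i] using only the highest-mass terms."""
    # Visit models in order of decreasing posterior probability.
    order = sorted(range(len(post)), key=lambda i: post[i], reverse=True)
    kept, covered = [], 0.0
    for i in order:
        kept.append(i)
        covered += post[i]
        if covered >= mass:
            break
    # Renormalize over the retained models only.
    estimate = sum(post[i] * values[i] for i in kept) / covered
    return estimate, kept


# Assumed posterior over five models and their individual predictions.
post = [0.60, 0.30, 0.05, 0.03, 0.02]
values = [0.8, 0.5, 0.2, 0.9, 0.1]

exact = sum(p * v for p, v in zip(post, values))
approx, kept = truncated_expectation(post, values)
```

In this toy case two of the five models already cover about 90% of the mass, and the truncated estimate stays close to the full sum. The discarded low-mass models can only move the result in proportion to the little mass they carry.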