BAYESIAN LEARNING IN PRACTICE

The previous section outlined how learning, reasoning and decision making should work in theory. The theory does not, however, take computational and storage requirements into account. In realistic situations, exact computation following Bayes' rule, the marginalisation principle and the rule of expected utility is almost always computationally prohibitive. An important research topic has therefore long been the development of methods which yield practical approximations to the exact theory.
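To make the source of the difficulty concrete, recall the form of exact posterior inference (writing $\theta$ for the model parameters and $D$ for the observed data; the notation is introduced here only for illustration). The posterior requires a normalising integral over the whole parameter space,
\begin{displaymath}
  p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)},
  \qquad
  p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta ,
\end{displaymath}
and predictions require a further marginalisation over the posterior,
\begin{displaymath}
  p(x_{\mathrm{new}} \mid D) = \int p(x_{\mathrm{new}} \mid \theta)\, p(\theta \mid D)\, d\theta .
\end{displaymath}
For models with many parameters these integrals typically have no closed form and cannot be evaluated exactly, which is why practical methods must approximate them.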

Currently the frequentist interpretation is the prevailing statistical philosophy. According to this view, probability measures the relative frequency of different outcomes in an (imaginary) infinite sequence of trials. Probability is thus defined for random variables, which can take different values in different trials. The frequency and belief interpretations have coexisted since the early days of probability theory (see, e.g., [75]). The advent of quantum physics gave the frequentist view strong impetus, because probability was seen as a measurable property of the physical world.

The Bayesian and frequentist schools use different language and methods. In frequentist statistics, a hypothesis or a parameter of a model cannot have a probability, since it is not a random variable and does not take different values in different trials. The methods developed within the frequentist school include estimators and confidence intervals for parameters and P-values for hypothesis testing; prior and posterior probabilities do not enter the calculations.

In Bayesian statistics, probability measures degree of belief, and it is therefore admissible to talk about the probabilities of hypotheses and parameters. The probability distribution of a parameter, for instance, quantifies the uncertainty about its value. Many Bayesian statisticians avoid talking about random variables altogether, because one is almost always referring to uncertainty about the outcomes of experiments, not to some intrinsic property of the world.
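A standard illustration (the specific example is not taken from the text): consider the bias $\theta$ of a coin after observing $k$ heads in $n$ tosses. Under a uniform prior, the posterior is
\begin{displaymath}
  p(\theta \mid D) \propto \theta^{k} (1-\theta)^{n-k} ,
\end{displaymath}
a Beta($k+1$, $n-k+1$) distribution. Its width shrinks as $n$ grows, so the distribution itself expresses how much uncertainty about $\theta$ remains after seeing the data; no appeal to repeated trials of $\theta$ is needed.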

The prevailing frequentist language and methods have caused misconceptions about the Bayesian view, and not all researchers are ready to accept the Bayesian approach to statistical inference as theoretically optimal. Here are answers to some common arguments against the Bayesian approach to learning:



 