We can now summarise what Bayesian probability theory and decision theory say about learning, reasoning and action by means of a simple example. Suppose there are prior assumptions and experience I and possible explanations expressed as states of the world Si. An observation D is made, and an action Aj is chosen based on the belief about what the consequence D' of the action will be. We assume D' is one of several possible observations D'k made after the action is chosen.
The prior assumptions and experience I are assumed to be such that it is possible to determine the prior probability P(Si | I) of each state of the world; the probability P(D | Si I) of the observation D given the state of the world Si; the probabilities P(D'k | Si Aj D I) of the different consequences of actions given the state of the world and the prior experience; and the utilities U(Aj D'k D I) of the consequences. The action Aj is assumed to have no effect on the state Si of the world, and thus P(Si | Aj D I) = P(Si | D I).
The first stage of the example is learning. Before the observation, the states of the world have the prior probabilities P(Si | I). After the observation D is made, the probabilities change according to Bayes' rule:
$$P(S_i \mid D\, I) = \frac{P(D \mid S_i\, I)\, P(S_i \mid I)}{P(D \mid I)} \tag{6}$$
The denominator P(D | I) = Σi P(D | Si I) P(Si | I) measures how well the states predict the observation on average; the belief in those states of the world which were able to predict the observation better than average increases, and vice versa.
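To make the learning stage concrete, here is a minimal sketch in Python (using NumPy); the two states and all probability values are hypothetical numbers chosen only for illustration:

    import numpy as np

    # Hypothetical setup: two states of the world S1, S2 with prior
    # probabilities P(Si | I) and likelihoods P(D | Si I) of one observation D.
    prior = np.array([0.5, 0.5])        # P(Si | I)
    likelihood = np.array([0.8, 0.2])   # P(D | Si I)

    # Bayes' rule (6): posterior is proportional to likelihood times prior,
    # normalised by the evidence P(D | I) = sum_i P(D | Si I) P(Si | I).
    evidence = likelihood @ prior                # P(D | I)
    posterior = likelihood * prior / evidence    # P(Si | D I)

    print(posterior)  # [0.8 0.2]: S1 predicted D better than average, so its belief grew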
The next stage is to infer which consequences different actions
have. According to the marginalisation principle,
$$P(D'_k \mid A_j\, D\, I) = \sum_i P(D'_k \mid S_i\, A_j\, D\, I)\, P(S_i \mid D\, I) \tag{7}$$
Notice that Aj was assumed to have no effect on Si, and thus the probability P(Si | Aj D I) appearing in the marginalisation is equal to the posterior probability P(Si | D I) which was computed in the first stage.
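Continuing the sketch, the marginalisation in (7) amounts to a matrix-vector product; the consequence probabilities below are again hypothetical:

    import numpy as np

    # Posterior P(Si | D I) from the learning stage, and hypothetical
    # probabilities P(D'k | Si Aj D I) of three consequences D'k of one
    # action Aj (rows: states Si, columns: consequences D'k).
    posterior = np.array([0.8, 0.2])
    consequences = np.array([[0.7, 0.2, 0.1],    # given S1
                             [0.1, 0.3, 0.6]])   # given S2

    # Marginalisation (7): sum out the states of the world.
    predictive = posterior @ consequences        # P(D'k | Aj D I)
    print(predictive)                            # [0.58 0.22 0.2]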
The third stage of the example is choosing the action which has the greatest expected utility. The utilities of the actions can be computed by the rule of expected utility:
$$U(A_j \mid D\, I) = \sum_k U(A_j\, D'_k\, D\, I)\, P(D'_k \mid A_j\, D\, I) \tag{8}$$
The utilities of the actions are based on the utilities of the consequences and on the probabilities of the consequences in the light of the experience, which were computed in the previous stage.
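Finally, the choice of action in (8) reduces to a weighted sum followed by an argmax; the second action A2 and the utility values below are hypothetical:

    import numpy as np

    # Hypothetical consequence probabilities P(D'k | Aj D I) for two actions
    # (rows: actions Aj, columns: consequences D'k), and the utilities
    # U(Aj D'k D I) of the consequences, here taken to be the same for both
    # actions.
    predictive = np.array([[0.58, 0.22, 0.20],   # consequences of A1
                           [0.10, 0.50, 0.40]])  # consequences of A2
    utility = np.array([10.0, 0.0, -5.0])        # U(Aj D'k D I) per consequence

    # Rule of expected utility (8): weight the utilities by the consequence
    # probabilities and choose the action with the greatest expected utility.
    expected = predictive @ utility              # U(Aj | D I)
    best = int(np.argmax(expected))
    print(expected, best)                        # [ 4.8 -1. ] 0 -> choose A1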
So far we have explicitly denoted that the probabilities are conditional on the prior assumptions and experience I. In most cases the context makes it clear what the prior assumptions are, and usually I is left out. This means that probability statements like P(Si) should be understood to mean P(Si | I), where I denotes the assumptions appropriate for the context.