To obtain a model that is useful in new situations, that is, one that generalises, some structure must be assumed among the unknown variables.
A typical structure in machine learning is a division of the unknown variables into parameters and latent variables. The distinction is that parameters are shared among data samples, whereas there is a separate set of latent variables for each data sample. Thus, the number of latent variables grows linearly with the size of the data, while the number of parameters stays constant.
The latent variables can be thought of as the internal state of a system.
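To make the distinction concrete, below is a minimal sketch in Python, assuming a Gaussian mixture model as a hypothetical example (the model choice and the names K, mu, pi, z are illustrative, not taken from the text): the component means and mixing weights are parameters shared by all samples, while each observation carries its own latent component assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters: shared by every data sample.
# Here, the means and mixing weights of K mixture components.
K = 3
mu = rng.normal(size=K)          # component means (parameters)
pi = np.full(K, 1.0 / K)         # mixing weights (parameters)

# Latent variables: one per data sample.
# Here, the component assignment z_i of each observation x_i.
n = 1000
z = rng.choice(K, size=n, p=pi)       # latent variables, grow with n
x = rng.normal(loc=mu[z], scale=1.0)  # observed data

# Doubling the data doubles the number of latent variables,
# while the number of parameters (K means + K weights) is unchanged.
assert z.size == n and mu.size == K
```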
Sometimes computing the posterior distribution over the parameters is called Bayesian learning, leaving the term Bayesian inference to refer only to computing the posterior distribution over the latent variables.
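In symbols, writing θ for the parameters, Z for the latent variables, and X for the observed data (these symbols are assumptions of ours, not fixed by the text), the two computations are:

```latex
% Bayesian learning: posterior over the shared parameters
p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)},
\qquad
% Bayesian inference (narrow sense): posterior over the latent variables
p(Z \mid X, \theta) = \frac{p(X \mid Z, \theta)\, p(Z \mid \theta)}{p(X \mid \theta)}.
```

Bayesian learning targets the first posterior; Bayesian inference, in the narrower sense above, targets the second.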
Graphical models, described in Chapter 3, provide a formalism for defining the exact structure of dependencies. The fundamental idea is that a complex system can be built by combining simpler parts. A graphical model is a graph whose nodes represent random variables and edges represent direct dependencies.
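As a hypothetical illustration of this formalism, the sketch below encodes a small directed graphical model as a Python mapping from each node to its parents; the joint distribution then factorises into one conditional per node given its parents. The three-variable graph and all names are assumptions for illustration only, ahead of the fuller treatment in Chapter 3.

```python
# A directed graphical model as a mapping from each node to its parents.
# Hypothetical example: parameters theta influence latent z, and both
# influence the observation x.
graph = {
    "theta": [],          # no parents
    "z": ["theta"],       # z depends directly on theta
    "x": ["theta", "z"],  # x depends directly on theta and z
}

def joint_factorisation(graph):
    """Return the factorisation of the joint implied by the graph:
    p(all nodes) = product over nodes of p(node | parents(node))."""
    factors = []
    for node, parents in graph.items():
        if parents:
            factors.append(f"p({node} | {', '.join(parents)})")
        else:
            factors.append(f"p({node})")
    return " ".join(factors)

print(joint_factorisation(graph))
# -> p(theta) p(z | theta) p(x | theta, z)
```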