next up previous
Next: Gaussian node Up: Building Blocks for Variational Previous: Variational Bayesian learning


Node types

In this section, we present the different types of nodes that can easily be combined. The variational Bayesian inference algorithm for these nodes is then discussed in Section 4.

Figure 2: First subfigure from the left: The circle represents a Gaussian node corresponding to the latent variable $ s$ conditioned on the mean $ m$ and variance $ \exp(-v)$. Second subfigure: Addition and multiplication nodes are used to form an affine mapping from $ s$ to $ As+a$. Third subfigure: A nonlinearity $ f$ is applied immediately after a Gaussian variable. Fourth subfigure: The delay operator delays a time-dependent signal by one time unit.
\begin{figure}\begin{center}
\epsfig{file=blocks.eps,width=11cm}
\end{center}
\end{figure}
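As an illustration of the delay operator in the rightmost subfigure of Figure 2, the following sketch (a hypothetical helper, not part of the original implementation) shifts a time-dependent signal by one time unit, so the output at time $ t$ is the input at time $ t-1$, with a given initial value at the first time index:

```python
def delay(signal, initial=0.0):
    """Delay a time-dependent signal by one time unit.

    The output at time t is the input at time t-1; the first
    output is a fixed initial value (an assumed convention here).
    """
    return [initial] + list(signal[:-1])

print(delay([1.0, 2.0, 3.0]))   # [0.0, 1.0, 2.0]
```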

In general, the building blocks can be divided into variable nodes, computation nodes, and constants. Each variable node corresponds to a random variable, which can be either observed or hidden. In this paper we present only one type of variable node, the Gaussian node, but other types can be used in the same framework. The computation nodes are the addition node, the multiplication node, the nonlinearity, and the delay node.

In the following, we shall refer to the inputs and outputs of the nodes. For a variable node, the inputs are the parameters of the conditional distribution of the variable represented by that node, and the output is the value of the variable. For a computation node, the output is a fixed function of the inputs. The symbols used for the various nodes are shown in Figure 2. Addition and multiplication nodes are not included, since they are typically combined to represent the effect of a linear transformation, which has a symbol of its own. The output signal of a node can be used as an input by zero or more other nodes, which are called the children of that node. Constants are the only nodes that do not have inputs; their output is a fixed value determined when the node is created.
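To make the node abstraction concrete, here is a minimal Python sketch with hypothetical class names (the paper does not show its implementation): computation nodes output a fixed function of their inputs, constants take no inputs, and every node keeps track of its children. The example builds the affine mapping $ As+a$ of Figure 2 from addition and multiplication nodes, using constants in place of actual variable nodes:

```python
class Node:
    """Base class: a node has inputs (parents) and children."""
    def __init__(self, *inputs):
        self.inputs = list(inputs)
        self.children = []            # nodes that use our output as input
        for parent in inputs:
            parent.children.append(self)

    def output(self):
        raise NotImplementedError

class Constant(Node):
    """No inputs; output is a fixed value determined at creation."""
    def __init__(self, value):
        super().__init__()            # no parents
        self.value = value

    def output(self):
        return self.value

class Addition(Node):
    """Computation node: output is the sum of the inputs."""
    def output(self):
        return sum(p.output() for p in self.inputs)

class Multiplication(Node):
    """Computation node: output is the product of the inputs."""
    def output(self):
        result = 1.0
        for p in self.inputs:
            result *= p.output()
        return result

# Affine mapping A*s + a built from computation nodes, as in Figure 2:
s, A, a = Constant(2.0), Constant(3.0), Constant(0.5)
y = Addition(Multiplication(A, s), a)
print(y.output())                     # 3.0 * 2.0 + 0.5 = 6.5
```

In the actual framework the inputs of a Gaussian variable node would be its distribution parameters rather than constants, but the parent-child bookkeeping is the same.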

Nodes are often structured in vectors or matrices. Assume for example that we have a data matrix $ {\mathbf{X}}=\left[{\mathbf{x}}(1),
{\mathbf{x}}(2),\dots ,{\mathbf{x}}(T)\right]$, where $ t=1,2,\dots,T$ is called the time index of an $ n$-dimensional observation vector. Note that $ t$ does not have to correspond to time in the real world; for example, different values of $ t$ could point to different people. In the implementation, nodes are either vectors, whose values are indexed by $ t$ (e.g. observations), or scalars, whose values are constant with respect to $ t$ (e.g. weights). The data $ {\mathbf{X}}$ would be represented with $ n$ vector nodes. A scalar node can be a parent of a vector node, but not a child of a vector node.
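The distinction between vector and scalar nodes can be sketched as follows (hypothetical class names, plain Python lists for brevity). A scalar node supplies the same value for every time index $ t$, which is what allows it to act as a parent of a vector node:

```python
class VectorNode:
    """Holds one value per time index t = 1, ..., T (e.g. observations)."""
    def __init__(self, values):
        self.values = list(values)

    def value_at(self, t):
        return self.values[t - 1]     # t is 1-based, as in the text

class ScalarNode:
    """Holds a single value, constant w.r.t. t (e.g. a weight)."""
    def __init__(self, value):
        self.value = value

    def value_at(self, t):
        return self.value             # same value for every t

# A scalar weight acting as parent of a time-indexed signal:
w = ScalarNode(2.0)
x = VectorNode([1.0, 2.0, 3.0])
y = [w.value_at(t) * x.value_at(t) for t in (1, 2, 3)]
print(y)                              # [2.0, 4.0, 6.0]
```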



Tapani Raiko 2006-08-28