
   
Gaussian Variables

A Gaussian variable s has two inputs, m and v, and prior probability $p(s \vert m, v) = \operatorname{N}\left(s;m,\exp(-v)\right)$, so that m is the mean and $\exp(-v)$ is the variance. The variance is parameterised this way because the mean and expected exponential of v then suffice for computing the cost function. In Appendix A, it is shown that when s, m and v are mutually independent under the posterior approximation, i.e. q(s, m, v) = q(s)q(m)q(v), $C_{s,p} = -\left< \ln p(s \vert m, v) \right>$ yields

 \begin{displaymath}C_{s,p} = \frac{1}{2}\left\{ \left< \exp v \right>
\left[\left(\overline{s} - \left< m \right>\right)^2 + \mathrm{Var}\left\{m\right\} + \widetilde{s} \right] -
\left< v \right> + \ln 2\pi\right\} .
\end{displaymath} (4.1)
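
As a minimal sketch, the cost term (4.1) can be evaluated from the input expectations alone. The sketch assumes that the bracketed factor is $\left< (s-m)^2 \right> = (\overline{s} - \left< m \right>)^2 + \mathrm{Var}\left\{m\right\} + \widetilde{s}$, which is what expanding the square under the factorised posterior gives. The function name and argument names are hypothetical and chosen only for illustration.

```python
import math


def gaussian_prior_cost(s_mean, s_var, m_mean, m_var, v_mean, v_expexp):
    """Sketch of C_{s,p} = -<ln p(s | m, v)> for q(s, m, v) = q(s) q(m) q(v).

    s_mean, s_var : posterior mean and variance of s (s_var = 0 if s is observed)
    m_mean, m_var : <m> and Var{m}
    v_mean        : <v>
    v_expexp      : <exp v>
    """
    # <(s - m)^2> for independent s and m (assumed form of the bracketed factor)
    expected_sq_err = (s_mean - m_mean) ** 2 + m_var + s_var
    return 0.5 * (v_expexp * expected_sq_err - v_mean + math.log(2.0 * math.pi))


# Sanity check: with point-valued m and v and an observed s, the cost
# reduces to the negative log-density of N(s; m, exp(-v)).
s, m, v = 0.3, -0.2, 1.5
cost = gaussian_prior_cost(s, 0.0, m, 0.0, v, math.exp(v))
variance = math.exp(-v)
neg_log_pdf = 0.5 * math.log(2.0 * math.pi * variance) + (s - m) ** 2 / (2.0 * variance)
assert abs(cost - neg_log_pdf) < 1e-12
```

Note that v enters only through `v_mean` and `v_expexp`, which is the reason given above for parameterising the variance as $\exp(-v)$.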

For observed variables this is the only term in the cost function, but for latent variables there is also $C_{s,q}$, the part resulting from $\left< \ln q(s) \right>$. The posterior approximation q(s) is defined to be Gaussian with mean $\overline{s}$ and variance $\widetilde{s}$: $q(s) = \operatorname{N}\left(s;\overline{s},\widetilde{s}\right)$. This yields

 \begin{displaymath}C_{s,q} = -\frac{1}{2} \ln \left( 2\pi e \widetilde{s} \right)
\end{displaymath} (4.2)

which is the negative entropy of a Gaussian variable with variance $\widetilde{s}$. The parameters $\overline{s}$ and $\widetilde{s}$ are to be optimised during learning.
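
The term (4.2) depends only on the posterior variance. A small sketch, with a hypothetical function name, shows how it behaves: narrowing the posterior lowers its entropy and therefore raises this part of the cost.

```python
import math


def gaussian_posterior_cost(s_var):
    """Sketch of C_{s,q} = <ln q(s)> = -0.5 * ln(2 * pi * e * s_var)."""
    if s_var <= 0.0:
        raise ValueError("posterior variance must be positive")
    return -0.5 * math.log(2.0 * math.pi * math.e * s_var)


# A narrower posterior has lower entropy and therefore a higher cost.
assert gaussian_posterior_cost(0.01) > gaussian_posterior_cost(1.0)
```

This term therefore counteracts any tendency of the other cost terms to shrink $\widetilde{s}$ towards zero.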

The output of a latent Gaussian node trivially provides the expectation and variance: $\left< s \right> = \overline{s}$ and $\mathrm{Var}\left\{s\right\} = \widetilde{s}$. The expected exponential is

$\displaystyle \left< \exp s \right>$ = $\displaystyle \int{q(s)e^s ds}$ (4.3)
  = $\displaystyle \int (2\pi \widetilde{s})^{-1/2}\exp\left[\frac{-(s-\overline{s})^2}{2\widetilde{s}}+s\right]ds$ (4.4)
  = $\displaystyle \int (2\pi \widetilde{s})^{-1/2}\exp\left[\frac{-(s-\overline{s}-\widetilde{s})^2}{2\widetilde{s}}+\overline{s}+\frac{\widetilde{s}}{2}\right]ds$ (4.5)
  = $\displaystyle \exp(\overline{s}+\widetilde{s}/2).$ (4.6)

The outputs of observed nodes are scalar values instead of distributions and thus $\left< s \right> = s$, $\mathrm{Var}\left\{s\right\} = 0$ and $\left< \exp s \right> = \exp s$.
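
The closed form (4.6) can be checked numerically. The following sketch compares it against a midpoint-rule approximation of the integral in (4.3); the function names, the integration range and the number of steps are illustrative assumptions. Setting the variance to zero recovers the observed-node case.

```python
import math


def expected_exp(s_mean, s_var):
    """Closed form <exp s> = exp(s_mean + s_var / 2) for q(s) = N(s; s_mean, s_var)."""
    return math.exp(s_mean + 0.5 * s_var)


def expected_exp_by_quadrature(s_mean, s_var, half_width=12.0, n=40_000):
    """Midpoint-rule approximation of the integral of q(s) * exp(s) over s.

    Integrates over s_mean +/- half_width standard deviations, outside of
    which the integrand is negligible for moderate variances.
    """
    sd = math.sqrt(s_var)
    lower = s_mean - half_width * sd
    step = 2.0 * half_width * sd / n
    norm = 1.0 / math.sqrt(2.0 * math.pi * s_var)
    total = 0.0
    for i in range(n):
        s = lower + (i + 0.5) * step
        q = norm * math.exp(-((s - s_mean) ** 2) / (2.0 * s_var))
        total += q * math.exp(s) * step
    return total


closed_form = expected_exp(0.5, 0.8)
numeric = expected_exp_by_quadrature(0.5, 0.8)
assert abs(closed_form - numeric) < 1e-9 * closed_form

# Observed node: zero variance, so <exp s> is simply exp(s).
assert expected_exp(0.5, 0.0) == math.exp(0.5)
```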



 
Tapani Raiko
2001-12-10