Next: Nonlinearity node Up: Variational Bayesian inference in Previous: Updating the posterior distribution


Addition and multiplication nodes

Consider first the addition node. The mean, variance and expected exponential of its output can be evaluated directly. Assuming that the inputs $s_i$ are statistically independent, these expectations are given respectively by

$\displaystyle \left< \sum_{i=1}^n s_i \right>$ $\displaystyle = \sum_{i=1}^n \left< s_i \right>$ (20)
$\displaystyle \mathrm{Var}\left\{\sum_{i=1}^n s_i\right\}$ $\displaystyle = \sum_{i=1}^n \mathrm{Var}\left\{s_i\right\}$ (21)
$\displaystyle \left< \exp \left( \sum_{i=1}^n s_i \right) \right>$ $\displaystyle = \prod_{i=1}^n \left< \exp s_i \right>$ (22)

The proof is given in Appendix B.1.
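As an illustration of Eqs. (20)-(22), the forward computation of a two-input addition node can be sketched as follows. The names `Moments`, `gaussian_moments` and `add_forward` are hypothetical and introduced only for this example. The inputs are assumed to be Gaussian solely so that $\left< \exp s \right> = \exp(\mathrm{mean} + \mathrm{Var}/2)$ is available in closed form for a cross-check; the addition formulas themselves require only independence.

```python
import math
from dataclasses import dataclass


@dataclass
class Moments:
    """Expectations that a node passes forward (illustrative container)."""
    mean: float      # <s>
    var: float       # Var{s}
    exp_mean: float  # <exp s>


def gaussian_moments(mean: float, var: float) -> Moments:
    # Assumption for this example only: s is Gaussian,
    # so <exp s> = exp(mean + var / 2).
    return Moments(mean, var, math.exp(mean + 0.5 * var))


def add_forward(a: Moments, b: Moments) -> Moments:
    # Eqs. (20)-(22) for two independent inputs.
    return Moments(a.mean + b.mean, a.var + b.var, a.exp_mean * b.exp_mean)


s1 = gaussian_moments(1.0, 0.5)
s2 = gaussian_moments(-2.0, 0.25)
total = add_forward(s1, s2)

# The sum of independent Gaussians is Gaussian, so its expected exponential
# can be recomputed from the summed mean and variance as a consistency check.
reference = gaussian_moments(s1.mean + s2.mean, s1.var + s2.var)
print(total)
print(reference)
```

Both printed records should agree up to rounding, since the product of the individual expected exponentials equals the expected exponential of the summed Gaussian.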

Consider next the multiplication node. Assuming independence between the inputs $s_i$, the mean and the variance of the output take the form (see Appendix B.1)

$\displaystyle \left< \prod_{i=1}^n s_i \right>$ $\displaystyle = \prod_{i=1}^n \left< s_i \right>$ (23)
$\displaystyle \mathrm{Var}\left\{\prod_{i=1}^n s_i\right\}$ $\displaystyle = \prod_{i=1}^n \left[ \left< s_i \right>^2 + \mathrm{Var}\left\{s_i\right\} \right] - \prod_{i=1}^n \left< s_i \right>^2$ (24)

For the multiplication node the expected exponential cannot be evaluated without knowing the exact distribution of the inputs.
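Equations (23) and (24) hold for any independent inputs with finite second moments, not only Gaussian ones. The sketch below checks the two-input case exactly by enumerating the product of two small discrete distributions. The helper names (`moments`, `mul_forward`, `product_distribution`) and the example distributions are assumptions made for illustration.

```python
from itertools import product


def moments(dist):
    """Mean and variance of a discrete distribution given as (value, prob) pairs."""
    mean = sum(p * x for x, p in dist)
    var = sum(p * (x - mean) ** 2 for x, p in dist)
    return mean, var


def mul_forward(m1, v1, m2, v2):
    # Eqs. (23) and (24) with n = 2.
    mean = m1 * m2
    var = (m1 ** 2 + v1) * (m2 ** 2 + v2) - (m1 * m2) ** 2
    return mean, var


def product_distribution(d1, d2):
    # Independence: the joint probability is the product of the marginals.
    return [(x * y, p * q) for (x, p), (y, q) in product(d1, d2)]


d1 = [(0.0, 0.5), (2.0, 0.5)]    # mean 1, variance 1
d2 = [(1.0, 0.25), (3.0, 0.75)]  # mean 2.5, variance 0.75

from_formula = mul_forward(*moments(d1), *moments(d2))
from_enumeration = moments(product_distribution(d1, d2))
print(from_formula, from_enumeration)
```

The two results coincide because only the first two moments of each input enter (23) and (24); the expected exponential of the product, in contrast, depends on the full input distributions.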

The formulas (20)-(24) are given for $n$ inputs for the sake of generality, but in practice we have carried out the needed calculations pairwise. If the general formula (24) is used directly, the variance can occasionally take a small negative value due to minor numerical imprecision in the computations. This problem does not arise in pairwise computations. This covers the propagation in the forward direction.
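The difference between the two strategies can be sketched as follows. The pairwise step here uses the expanded two-input form $\mathrm{Var}\{s_1\}\mathrm{Var}\{s_2\} + \mathrm{Var}\{s_1\}\left< s_2 \right>^2 + \mathrm{Var}\{s_2\}\left< s_1 \right>^2$, which is algebraically equal to (24) with $n=2$ and is a sum of non-negative terms. This particular form is an assumption of the example; the text does not state which expression the pairwise implementation uses. The function names are hypothetical.

```python
import math
from functools import reduce


def product_var_general(means, variances):
    # Eq. (24) evaluated directly: a difference of two products, which can
    # suffer from cancellation when the variances are small relative to the means.
    second_moments = math.prod(m * m + v for m, v in zip(means, variances))
    squared_means = math.prod(m * m for m in means)
    return second_moments - squared_means


def product_moments_pairwise(means, variances):
    # Fold the inputs two at a time. Each step combines the running product
    # (mean m1, variance v1) with the next independent input (m2, v2).
    def step(acc, nxt):
        m1, v1 = acc
        m2, v2 = nxt
        # Expanded form of Eq. (24) for n = 2: no subtraction is involved.
        return m1 * m2, v1 * v2 + v1 * m2 * m2 + v2 * m1 * m1

    return reduce(step, zip(means, variances))


means = [2.0, -3.0, 0.5]
variances = [0.1, 0.2, 0.3]
print(product_var_general(means, variances))
print(product_moments_pairwise(means, variances))
```

For well-conditioned inputs such as these, both routes give the same variance up to rounding. The pairwise route cannot return a negative variance as long as the input variances are non-negative.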

The cost function propagating from children to parents is assumed to be of the form (17). This holds even when there are addition and multiplication nodes in between (see Appendix B.2 for proof). Therefore, only the gradients of the cost function with respect to the different expectations need to be propagated backwards to identify the whole cost function with respect to the parent. The required formulas are obtained in a straightforward manner from Eqs. (20)-(24). The gradients for the addition node are:

$\displaystyle \frac{\partial C}{\partial \left< s_1 \right>} = \frac{\partial C}{\partial \left< s_1 + s_2 \right>}$ (25)

$\displaystyle \frac{\partial C}{\partial \mathrm{Var}\left\{s_1\right\}} = \frac{\partial C}{\partial \mathrm{Var}\left\{s_1 + s_2\right\}}$ (26)

$\displaystyle \frac{\partial C}{\partial \left< \exp s_1 \right>} = \left< \exp s_2 \right>\frac{\partial C}{\partial \left< \exp (s_1+s_2) \right>}.$ (27)
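A minimal sketch of the backward pass through a two-input addition node, following Eqs. (25)-(27), is given below. The function name and argument names are hypothetical, and the gradients with respect to $s_2$ are obtained by symmetry, which is an assumption of the example rather than something written out in the equations.

```python
def add_backward(grad_mean_out, grad_var_out, grad_exp_out, exp_s1, exp_s2):
    """Propagate gradients of the cost C from the output s1 + s2 to both inputs.

    grad_*_out are dC/d<s1+s2>, dC/dVar{s1+s2} and dC/d<exp(s1+s2)>;
    exp_s1 and exp_s2 are the forward values <exp s1> and <exp s2>.
    """
    # Eqs. (25)-(27): mean and variance gradients pass through unchanged,
    # the expected-exponential gradient is scaled by the other input's <exp s>.
    grads_s1 = (grad_mean_out, grad_var_out, exp_s2 * grad_exp_out)
    # By symmetry, the same holds for s2 with the roles exchanged.
    grads_s2 = (grad_mean_out, grad_var_out, exp_s1 * grad_exp_out)
    return grads_s1, grads_s2


grads_s1, grads_s2 = add_backward(0.5, -1.0, 2.0, exp_s1=3.0, exp_s2=4.0)
print(grads_s1, grads_s2)
```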

For the multiplication node, they become

$\displaystyle \frac{\partial C}{\partial \left< s_1 \right>}$ $\displaystyle = \left< s_2 \right> \frac{\partial C}{\partial \left< s_1 s_2 \right>} + 2 \mathrm{Var}\left\{s_2\right\} \frac{\partial C}{\partial \mathrm{Var}\left\{s_1 s_2\right\}} \left< s_1 \right>$ (28)
$\displaystyle \frac{\partial C}{\partial \mathrm{Var}\left\{s_1\right\}}$ $\displaystyle = \left(\left< s_2 \right>^2 + \mathrm{Var}\left\{s_2\right\}\right)\frac{\partial C}{\partial \mathrm{Var}\left\{s_1 s_2\right\}}.$ (29)
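The multiplication-node gradients follow from applying the chain rule to (23) and (24) with $n=2$. The sketch below implements them and compares the result against central finite differences. The cost `toy_cost` is a hypothetical linear function of the output mean and variance, chosen only so that its gradients with respect to the output expectations are the known constants `a` and `b`; it is not the cost of form (17).

```python
def mul_forward(m1, v1, m2, v2):
    # Eqs. (23) and (24) with n = 2.
    mean = m1 * m2
    var = (m1 * m1 + v1) * (m2 * m2 + v2) - (m1 * m2) ** 2
    return mean, var


def mul_backward(grad_mean_out, grad_var_out, m1, m2, v2):
    # Chain rule through mul_forward with respect to the first input.
    grad_m1 = m2 * grad_mean_out + 2.0 * v2 * grad_var_out * m1
    grad_v1 = (m2 * m2 + v2) * grad_var_out
    return grad_m1, grad_v1


def toy_cost(mean_out, var_out, a=0.7, b=-1.3):
    # Hypothetical cost: dC/d<s1 s2> = a and dC/dVar{s1 s2} = b.
    return a * mean_out + b * var_out


def numeric_grads(m1, v1, m2, v2, h=1e-5):
    # Central differences of the cost with respect to <s1> and Var{s1}.
    def cost(m, v):
        return toy_cost(*mul_forward(m, v, m2, v2))

    d_m1 = (cost(m1 + h, v1) - cost(m1 - h, v1)) / (2.0 * h)
    d_v1 = (cost(m1, v1 + h) - cost(m1, v1 - h)) / (2.0 * h)
    return d_m1, d_v1


m1, v1, m2, v2 = 1.5, 0.4, -2.0, 0.3
analytic = mul_backward(0.7, -1.3, m1, m2, v2)
numeric = numeric_grads(m1, v1, m2, v2)
print(analytic, numeric)
```

The gradients with respect to $\left< s_2 \right>$ and $\mathrm{Var}\{s_2\}$ are obtained by exchanging the roles of the two inputs.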

In conclusion, addition and multiplication nodes can be placed between Gaussian nodes, and the costs still retain the form (17). Proofs can be found in Appendices B.1 and B.2.


Tapani Raiko 2006-08-28