
Expectation of a function

Equations 6 and 7 show that we could compute the expected description length if we knew how to evaluate the expectations of the functions $f_i$.

Let $f_i(\xi_j \vert j \in \mathcal{J}_i)$ be a function whose expectation value we would like to compute. As before, we shall approximate $f_i$ with its second-order Taylor series expansion. However, we are not going to expand $f_i$ with respect to the discretised parameters $\boldsymbol{\hat{\theta}}$ of the network, but with respect to the direct parameters $\xi_j$ of the function $f_i$. The expansion is thus computed about the expectation values $\mu_j = E\{\xi_j\}$.
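 \begin{multline}
 \xi_i = f_i(\xi_j \vert j \in \mathcal{J}_i) \approx f_i(\mu_j \vert j \in \mathcal{J}_i)
 + \sum_{j \in \mathcal{J}_i} \frac{\partial f_i}{\partial \xi_j} (\xi_j - \mu_j) \\
 + \frac{1}{2} \sum_{j, k \in \mathcal{J}_i} \frac{\partial^2 f_i}{\partial \xi_j \partial \xi_k} (\xi_j - \mu_j) (\xi_k
 - \mu_k) \tag{8}\end{multline}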
Taking the expectation of both sides of equation 8 yields
 \begin{multline}
 \mu_i = E\{f_i(\xi_j \vert j \in \mathcal{J}_i)\} \approx \\
 f_i(\mu_j \vert j \in \mathcal{J}_i) + \sum_{j \in \mathcal{J}_i} \frac{v_j}{2}
 \frac{\partial^2 f_i}{\partial {\xi_j}^2}. \tag{9}\end{multline}
Here we have denoted the variance of $\xi_j$ by $v_j$: $v_j \stackrel{\mathit{def}}{=} E\{(\xi_j - \mu_j)^2\}$. All partial derivatives are evaluated at the expectation values $\mu_j$. The first-order terms disappear because $E\{\xi_j - \mu_j\} = 0$. We have also assumed that for all $j \neq k$, either $\xi_j$ and $\xi_k$ are uncorrelated or $\frac{\partial^2 f_i}{\partial \xi_j \partial \xi_k} = 0$, which removes the second-order cross terms.
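
The approximation $\mu_i \approx f_i(\mu_j \vert j \in \mathcal{J}_i) + \sum_j \frac{v_j}{2} \frac{\partial^2 f_i}{\partial {\xi_j}^2}$ can be checked numerically. The following is only a minimal sketch under stated assumptions: the name `approx_expectation`, the step size `h`, and the use of central finite differences for the second derivatives are illustrative choices, not part of the original derivation, and the inputs are assumed to be uncorrelated so that the cross terms vanish.

```python
import math


def approx_expectation(f, means, variances, h=1e-3):
    """Second-order approximation of E{f(xi_1, ..., xi_n)}.

    Implements  f(mu) + sum_j (v_j / 2) * d^2 f / d xi_j^2  with all
    derivatives evaluated at the means. The second derivatives are
    estimated with central finite differences (an assumption of this
    sketch; analytic derivatives could be used instead).
    """
    f_at_mean = f(*means)
    total = f_at_mean
    for j, (mu_j, v_j) in enumerate(zip(means, variances)):
        up = list(means)
        down = list(means)
        up[j] = mu_j + h
        down[j] = mu_j - h
        # Central difference estimate of the second partial derivative.
        d2f = (f(*up) - 2.0 * f_at_mean + f(*down)) / h**2
        total += 0.5 * v_j * d2f
    return total


# Quadratic test function: the second-order expansion is exact here,
# since E{x^2 + 3y} = mu_x^2 + v_x + 3 mu_y.
quadratic = approx_expectation(lambda x, y: x**2 + 3.0 * y,
                               means=[1.0, 2.0], variances=[0.5, 0.1])

# Exponential of a Gaussian variable: the exact expectation is
# exp(mu + v / 2), so the approximation error can be inspected directly.
mu, v = 0.0, 0.04
approx_exp = approx_expectation(math.exp, means=[mu], variances=[v])
exact_exp = math.exp(mu + v / 2.0)
```

For the quadratic function the result agrees with the exact value 7.5 up to rounding error, because all derivatives above second order vanish. For the exponential the approximation gives about $e^{\mu}(1 + v/2)$, which is close to the exact value only when the variance is small relative to the curvature of the function.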



Harri Lappalainen
5/19/1998