Next: Summary of the updating Up: Updating for the Gaussian Previous: Newton iteration for the

Fixed-point iteration for the variance $ v$

A simple fixed-point iteration rule for the variance $ v$ is obtained by setting the derivative of $ {\cal C}(m, v)$ with respect to $ v$ to zero and rearranging:

$\displaystyle 0 = \frac{\partial {\cal C}(m, v)}{\partial v} = V + \frac{E}{2} \exp(m + v/2) - \frac{1}{2v} \quad \Leftrightarrow \quad v = \frac{1}{2V + E \exp(m + v/2)} \stackrel{\mathit{def}}{=}g(v)$ (56)

$\displaystyle v_{i+1} = g(v_i)$ (57)
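As a rough illustration, iteration (57) could be sketched in Python as follows. The names `g` and `iterate_plain`, the starting value, the iteration cap and the tolerance are assumptions made for this sketch, and $ m$, $ V$ and $ E$ are taken to be given constants with $ V > 0$ and $ E > 0$.

```python
import math


def g(v, m, V, E):
    """Right-hand side of (56): g(v) = 1 / (2V + E exp(m + v/2))."""
    return 1.0 / (2.0 * V + E * math.exp(m + v / 2.0))


def iterate_plain(v0, m, V, E, max_iter=100, tol=1e-12):
    """Apply v_{i+1} = g(v_i) until successive estimates differ by less than tol.

    Returns the last estimate even if the tolerance was never reached.
    """
    v = v0
    for _ in range(max_iter):
        v_next = g(v, m, V, E)
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    return v


# Hypothetical parameter values, chosen only so that the plain iteration converges.
v_plain = iterate_plain(v0=0.1, m=0.0, V=1.0, E=1.0)
```

The function returns its last estimate without signalling failure, so a caller that needs a guarantee would have to check the residual $ \vert v - g(v)\vert$ itself.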

In general, a fixed-point iteration is stable around the solution $ v_{\mathrm{opt}}$ if $ \vert g'(v_{\mathrm{opt}})\vert < 1$, and it converges fastest when the derivative $ g'(v_{\mathrm{opt}})$ is near zero. In our case $ g'(v_i)$ is always negative and can be less than $ -1$, in which case the solution is an unstable fixed point. This can be avoided by taking a weighted average of (57) and a trivial iteration $ v_{i+1} = v_i$:

$\displaystyle v_{i+1} = \frac{g(v_i) + \xi(v_i) v_i}{\xi(v_i) + 1} \stackrel{\mathit{def}}{=}f(v_i)$ (58)

The weight $ \xi$ should be chosen so that the derivative of $ f$ is close to zero at the optimal solution $ v_{\mathrm{opt}}$. The derivative is exactly zero when $ \xi(v_{\mathrm{opt}}) = -g'(v_{\mathrm{opt}})$.
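This condition can be checked by differentiating $ f$ directly. The following is a sketch that takes $ f(v) = [g(v) + \xi(v) v] / [\xi(v) + 1]$, that is, the weighted average with the weight $ \xi$ attached to the trivial iterate $ v$:

$\displaystyle f'(v) = \frac{\left[g'(v) + \xi'(v) v + \xi(v)\right]\left[\xi(v) + 1\right] - \left[g(v) + \xi(v) v\right] \xi'(v)}{\left[\xi(v) + 1\right]^2}$

At the fixed point $ g(v_{\mathrm{opt}}) = v_{\mathrm{opt}}$, so $ g(v_{\mathrm{opt}}) + \xi(v_{\mathrm{opt}}) v_{\mathrm{opt}}= v_{\mathrm{opt}}\left[\xi(v_{\mathrm{opt}}) + 1\right]$ and the terms containing $ \xi'$ cancel:

$\displaystyle f'(v_{\mathrm{opt}}) = \frac{g'(v_{\mathrm{opt}}) + \xi(v_{\mathrm{opt}})}{\xi(v_{\mathrm{opt}}) + 1}$

The numerator vanishes exactly when $ \xi(v_{\mathrm{opt}}) = -g'(v_{\mathrm{opt}})$.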

It holds that

$\displaystyle g'(v) = -\frac{(E/2) \exp(m + v/2)}{\left[2V + E \exp(m + v/2)\right]^2} = g^2(v) \left[V - \frac{1}{2g(v)}\right] = g(v) \left[V g(v) - \frac{1}{2}\right] \quad \Rightarrow$
$\displaystyle g'(v_{\mathrm{opt}}) = v_{\mathrm{opt}}\left[V v_{\mathrm{opt}}- \frac{1}{2}\right] \quad \Rightarrow \quad \xi(v_{\mathrm{opt}}) = v_{\mathrm{opt}}\left[\frac{1}{2} - V v_{\mathrm{opt}}\right]$ (59)

The last steps follow from the fact that $ v_{\mathrm{opt}}= g(v_{\mathrm{opt}})$ and from the requirement that $ f'(v_{\mathrm{opt}}) = 0$. We can assume that $ v$ is close to $ v_{\mathrm{opt}}$ and use

$\displaystyle \xi(v) = v \left[\frac{1}{2} - V v\right] \, .$ (60)
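The closed form of $ g'$ in (59) and its relation to the weight $ \xi$ can be sanity-checked numerically. The sketch below is only an illustration: the helper names, the test point and the parameter values are hypothetical, and `xi` evaluates the bracket at the current $ v$ on the assumption that $ v$ is close to $ v_{\mathrm{opt}}$.

```python
import math


def g(v, m, V, E):
    """g(v) = 1 / (2V + E exp(m + v/2)), as in (56)."""
    return 1.0 / (2.0 * V + E * math.exp(m + v / 2.0))


def g_prime(v, m, V, E):
    """Closed form from (59): g'(v) = g(v) [V g(v) - 1/2]."""
    gv = g(v, m, V, E)
    return gv * (V * gv - 0.5)


def xi(v, V):
    """Weight with the bracket evaluated at the current v (an assumption)."""
    return v * (0.5 - V * v)


# Hypothetical parameters and test point.
m, V, E = -1.0, 0.3, 2.0
v, h = 0.8, 1e-6

# Central finite difference versus the closed form of g'.
numeric = (g(v + h, m, V, E) - g(v - h, m, V, E)) / (2.0 * h)
closed = g_prime(v, m, V, E)

# Locate the fixed point by plain iteration (stable for these values),
# then compare xi(v_opt) with -g'(v_opt).
v_opt = 0.5
for _ in range(200):
    v_opt = g(v_opt, m, V, E)
gap = xi(v_opt, V) + g_prime(v_opt, m, V, E)
```

Here `numeric` and `closed` should agree up to finite-difference error, and `gap` should be zero up to rounding.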

Note that the iteration (57) can only yield estimates with $ 0 < v_{i+1} < 1/(2V)$, which means that $ \xi(v_{i+1}) > 0$. Therefore the use of $ \xi$ always makes the step taken in (58) shorter than the plain step of (57). If the initial estimate $ v_0 > 1/(2V)$, we can set it to $ v_0 = 1/(2V)$.
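Putting the pieces together, a damped iteration in the spirit of (58) and (60) might look like the sketch below. It assumes that the weight $ \xi$ is attached to the trivial iterate $ v_i$ (the form for which $ \xi(v_{\mathrm{opt}}) = -g'(v_{\mathrm{opt}})$ gives $ f'(v_{\mathrm{opt}}) = 0$) and that the bracket in $ \xi$ is evaluated at the current estimate. The function names, tolerance, iteration cap and parameter values are hypothetical; the parameters were picked so that the plain iteration (57) is unstable.

```python
import math


def g(v, m, V, E):
    """g(v) = 1 / (2V + E exp(m + v/2)), as in (56)."""
    return 1.0 / (2.0 * V + E * math.exp(m + v / 2.0))


def xi(v, V):
    """Weight with the bracket evaluated at the current v (an assumption)."""
    return v * (0.5 - V * v)


def iterate_damped(v0, m, V, E, max_iter=200, tol=1e-12):
    """Weighted average of g(v_i) and v_i, with weight xi(v_i) on v_i.

    Assumes v0 > 0. Returns the last estimate even if tol was not reached.
    """
    # Clamp the initial estimate to 1/(2V) so that xi(v) is never negative.
    v = min(v0, 1.0 / (2.0 * V))
    for _ in range(max_iter):
        w = xi(v, V)
        v_next = (g(v, m, V, E) + w * v) / (w + 1.0)
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    return v


# Hypothetical parameters for which |g'(v_opt)| > 1.
m, V, E = -5.0, 0.01, 1.0
v_damped = iterate_damped(1.0, m, V, E)

# Plain iteration (57) from the same starting point, for comparison.
v_plain = 1.0
for _ in range(50):
    v_plain = g(v_plain, m, V, E)
```

In this example the damped estimate satisfies $ v = g(v)$ to within the tolerance, whereas the plain iterate keeps jumping between a value close to zero and a value close to $ 1/(2V)$.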
Tapani Raiko 2006-08-28