Fixed-point iteration for the variance


A simple fixed-point iteration rule is obtained for the variance $v$ by solving for the zero of the derivative of the cost function ${\cal C}(m, v)$:

$\displaystyle 0 = \frac{\partial {\cal C}(m, v)}{\partial v} = V + \frac{E}{2} \exp(m + v/2) - \frac{1}{2v} \Leftrightarrow$
$\displaystyle v = \frac{1}{2V + E \exp(m + v/2)} \stackrel{\mathit{def}}{=}g(v)$    (56)

$\displaystyle v_{i+1} = g(v_i)$    (57)
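The plain iteration (57) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the fixed iteration count, and the parameter values used in the check are hypothetical and not part of the original text; $m$, $V$ and $E$ are treated as given positive-valued constants of the cost function.

```python
import math


def g(v, m, V, E):
    """Right-hand side of (56): g(v) = 1 / (2V + E * exp(m + v/2))."""
    return 1.0 / (2.0 * V + E * math.exp(m + v / 2.0))


def fixed_point(v0, m, V, E, n_iter=50):
    """Apply v_{i+1} = g(v_i) for a fixed number of steps (hypothetical helper)."""
    v = v0
    for _ in range(n_iter):
        v = g(v, m, V, E)
    return v


# Illustrative values only (assumed, not from the text).
v = fixed_point(1.0, m=0.0, V=1.0, E=1.0)
residual = abs(v - g(v, 0.0, 1.0, 1.0))  # close to zero when the iteration has converged
```

With these particular values the fixed point lies below $1/(2V) = 0.5$ and $\vert g'\vert$ is small there, so the plain iteration settles quickly; it need not do so for other parameter values.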

In general, a fixed-point iteration is stable around the solution $v_{\mathrm{opt}}$ if $\vert g'(v_{\mathrm{opt}})\vert < 1$, and it converges fastest when the derivative $g'(v_{\mathrm{opt}})$ is near zero. In our case $g'(v)$ is always negative and can be less than $-1$. In this case the solution can be an unstable fixed point. This can be avoided by taking a weighted average of (57) and the trivial iteration $v_{i+1} = v_i$:

$\displaystyle v_{i+1} = \frac{\xi(v_i) v_i + g(v_i)}{\xi(v_i) + 1} \stackrel{\mathit{def}}{=}f(v_i)$    (58)

The weight $\xi$ should be such that the derivative of $f$ is close to zero at the optimal solution $v_{\mathrm{opt}}$, which is achieved exactly when $\xi(v_{\mathrm{opt}}) = -g'(v_{\mathrm{opt}})$.
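The condition on $\xi$ can be checked by differentiating $f$ directly. The sketch below assumes the weighted average places the weight $\xi$ on the current estimate, $f(v) = [\xi(v)\, v + g(v)] / [\xi(v) + 1]$, and uses $g(v_{\mathrm{opt}}) = v_{\mathrm{opt}}$ to cancel the terms containing $\xi'$:

```latex
\begin{align*}
f'(v) &= \frac{\left[\xi'(v)\, v + \xi(v) + g'(v)\right]\left[\xi(v) + 1\right]
              - \left[\xi(v)\, v + g(v)\right] \xi'(v)}{\left[\xi(v) + 1\right]^2} \\
f'(v_{\mathrm{opt}}) &= \frac{\xi(v_{\mathrm{opt}}) + g'(v_{\mathrm{opt}})}{\xi(v_{\mathrm{opt}}) + 1}
\end{align*}
```

Under this assumption $f'(v_{\mathrm{opt}})$ vanishes exactly when $\xi(v_{\mathrm{opt}}) = -g'(v_{\mathrm{opt}})$.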

It holds that

$\displaystyle g'(v) = -\frac{(E/2) \exp(m + v/2)}{\left[2V + E \exp(m + v/2)\right]^2} = g^2(v) \left[V - \frac{1}{2g(v)}\right] = g(v) \left[V g(v) - \frac{1}{2}\right] \Rightarrow$
$\displaystyle g'(v_{\mathrm{opt}}) = v_{\mathrm{opt}}\left[V v_{\mathrm{opt}} - \frac{1}{2}\right] \Rightarrow \xi(v_{\mathrm{opt}}) = v_{\mathrm{opt}}\left[\frac{1}{2} - V v_{\mathrm{opt}}\right]$    (59)

The last steps follow from the fact that $v_{\mathrm{opt}}= g(v_{\mathrm{opt}})$ and from the requirement that $f'(v_{\mathrm{opt}}) = 0$. Since $v_{\mathrm{opt}}$ is not known in advance, we can assume that $v$ is close to $v_{\mathrm{opt}}$ and use

$\displaystyle \xi(v) = v \left[\frac{1}{2} - V v\right] \, .$    (60)

Note that the iteration (57) can only yield estimates with $0 < v_{i+1} < 1/(2V)$, which means that $\xi(v_{i+1}) > 0$. Therefore the use of $\xi$ always shortens the step taken in (58). If the initial estimate $v_0 > 1/(2V)$, we can set it to $v_0 = 1/(2V)$.
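The damped iteration can be compared with the plain one in a short Python sketch. Several things here are assumptions made for illustration: the function names, the fixed iteration count, the parameter values, the form of the update (weight $\xi$ on the current estimate $v_i$, weight one on $g(v_i)$), the weight $\xi(v) = v\,[1/2 - V v]$, and the clipping of an initial estimate above $1/(2V)$. The values of $m$, $V$ and $E$ are chosen so that $v_{\mathrm{opt}} = 4$ and $g'(v_{\mathrm{opt}}) = -1.84$, which makes the plain iteration unstable.

```python
import math


def g(v, m, V, E):
    """g(v) = 1 / (2V + E * exp(m + v/2)), the plain fixed-point map."""
    return 1.0 / (2.0 * V + E * math.exp(m + v / 2.0))


def xi(v, V):
    """Assumed damping weight xi(v) = v * (1/2 - V*v); positive for 0 < v < 1/(2V)."""
    return v * (0.5 - V * v)


def plain_fixed_point(v0, m, V, E, n_iter=50):
    """Undamped iteration v_{i+1} = g(v_i)."""
    v = v0
    for _ in range(n_iter):
        v = g(v, m, V, E)
    return v


def damped_fixed_point(v0, m, V, E, n_iter=50):
    """Damped iteration v_{i+1} = (xi(v_i) * v_i + g(v_i)) / (xi(v_i) + 1)."""
    # Assumption: an initial estimate above 1/(2V) is clipped to 1/(2V).
    v = min(v0, 1.0 / (2.0 * V))
    for _ in range(n_iter):
        w = xi(v, V)
        v = (w * v + g(v, m, V, E)) / (w + 1.0)
    return v


# Illustrative parameters (assumed): g(4) = 1 / (0.02 + 0.23) = 4.
m, V = 0.0, 0.01
E = 0.23 * math.exp(-2.0)

v_damped = damped_fixed_point(1.0, m, V, E)
v_plain = plain_fixed_point(1.0, m, V, E)
residual_damped = abs(v_damped - g(v_damped, m, V, E))
residual_plain = abs(v_plain - g(v_plain, m, V, E))
```

For these values the damped iterate approaches $v_{\mathrm{opt}} = 4$, while the plain iterate keeps jumping between a very small and a very large value, so its residual $\vert v - g(v)\vert$ stays large.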


Tapani Raiko 2006-08-28