The update of a Gaussian node followed by a nonlinearity is similar
to that of the plain Gaussian node. The source is updated to minimise
those terms of the cost function defined in ( ) that are
affected by it. Other parts of the network are held constant during
the update.
In addition to the terms arising from the variable itself, defined in
( ) and ( ), the terms corresponding to the
variables to which the output is propagated are also affected. The
gradients of Cp w.r.t.
and
are
assumed to arise from a quadratic term
.
This assumption is shown to be true in Section
.
The update is done by iterating the following steps until the resulting update steps become shorter than some very small constant.