The update of a Gaussian node followed by a nonlinearity is similar
to that of the plain Gaussian node. The source is updated to minimise
those terms of the cost function defined in () that it
affects. Other parts of the network are considered constant during
the update.
In addition to the terms arising from the variable itself, defined in
() and (), the terms corresponding to the
variables that the output is propagated to are affected. The
gradients of Cp w.r.t.
and
are
assumed to arise from a quadratic term
.
This assumption is shown to be true in Section
.
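
The idea of updating one variable against only the terms it affects can be sketched with a toy example. This is a minimal illustration under stated assumptions, not the node's actual update rule: the cost terms, the variable names `s`, `x`, `y`, and the helpers `affected_cost` and `update` are all hypothetical, and a plain ternary search stands in for the real minimisation.

```python
# Toy cost: a sum of terms, each tagged with the variables it depends on.
# All terms and names here are hypothetical stand-ins for illustration.
TERMS = [
    ({"s"}, lambda v: (v["s"] - 1.0) ** 2),          # term arising from the variable itself
    ({"s", "x"}, lambda v: (v["x"] - v["s"]) ** 2),  # term of a variable the output is propagated to
    ({"x", "y"}, lambda v: (v["y"] - v["x"]) ** 2),  # term not affected by s
]


def affected_cost(name, values):
    """Sum only the cost terms that depend on the variable `name`."""
    return sum(f(values) for deps, f in TERMS if name in deps)


def update(name, values, lo=-10.0, hi=10.0, iters=200):
    """Minimise the affected terms over values[name] by ternary search.

    All other variables are held constant. The search assumes the
    affected cost is convex in values[name] on the interval [lo, hi].
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        c1 = affected_cost(name, {**values, name: m1})
        c2 = affected_cost(name, {**values, name: m2})
        if c1 < c2:
            hi = m2
        else:
            lo = m1
    return {**values, name: (lo + hi) / 2.0}


if __name__ == "__main__":
    before = {"s": 0.0, "x": 3.0, "y": 5.0}
    after = update("s", before)
    print(after)
```

In this toy setting the third term is constant with respect to `s`, so leaving it out does not change the minimiser. The affected terms are (s - 1)^2 + (3 - s)^2, whose minimum lies at s = 2.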
The update is done by repeating the following steps until the steps taken become shorter than some very small constant value.