As an example of Gaussian parameters we shall consider and
. All the others are handled in essentially the same way
except that there are no weights needed for different states.
To simplify the notation, all the indices from and
are dropped for the remainder of this section. The relevant terms
of the cost function are now, up to an additive constant
Let us denote
The derivative of this expression with respect to
is easy to evaluate
Setting this to zero gives
The derivative with respect to
The solutions for parameters of are exact. The true
posterior for these parameters is also Gaussian so the approximation
is equal to it. This is not the case for the parameters of
The true posterior for
is not Gaussian. The best Gaussian
approximation with respect to the chosen criterion can still be found
by finding the zero of the derivative of the cost function with
respect to the parameters of
. This is done using Newton's method.
The derivatives with respect to
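
To illustrate how a zero of the derivative can be located with Newton's iteration, a minimal sketch is given here. The cost `C(m) = a * exp(-2 m) + b * m`, the constants `a` and `b`, and the function name `newton_stationary_point` are assumptions made only for this illustration. They are not the actual terms of the cost function discussed in this section.

```python
import math


def newton_stationary_point(grad, hess, x0, tol=1e-12, max_iter=100):
    """Find a zero of grad with Newton's iteration x <- x - grad(x) / hess(x)."""
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)
        x -= step
        # Stop once the update no longer changes x noticeably.
        if abs(step) < tol:
            break
    return x


# Hypothetical one-dimensional cost, chosen only for illustration:
#   C(m) = a * exp(-2 m) + b * m,  with a, b > 0
a, b = 1.0, 1.0


def grad(m):
    # First derivative dC/dm
    return -2.0 * a * math.exp(-2.0 * m) + b


def hess(m):
    # Second derivative d^2C/dm^2, positive for every m
    return 4.0 * a * math.exp(-2.0 * m)


m_star = newton_stationary_point(grad, hess, x0=0.0)

# For this toy cost the stationary point is known in closed form.
m_exact = 0.5 * math.log(2.0 * a / b)
```

For this toy cost the iterate `m_star` can be compared with the closed-form value `m_exact`. In the real problem no closed form is available, which is why an iterative method is needed.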