In a hierarchical model, the data X depends only on some of the parameters, which are called the parameters of the first layer. The parameters of the first layer depend only on the second-layer parameters, and so on.
The term in equation ( ) can be split into a product of simpler terms, since the dependencies between layers can be truncated:
[Equation (3.12): factorization of the term into one factor per layer]
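For orientation, a factorization of this kind can be written as below. The notation is an assumption rather than the original's: $\theta_i$ stands for the parameters of layer $i$, $n$ for the number of layers, and $p$ for a probability density. The sketch only illustrates how each factor conditions on the layer directly above it.

```latex
\begin{align}
p(X, \theta_1, \ldots, \theta_n)
  &= p(X \mid \theta_1, \ldots, \theta_n)\, p(\theta_1 \mid \theta_2, \ldots, \theta_n) \cdots p(\theta_n) \\
  &= p(X \mid \theta_1)\, p(\theta_1 \mid \theta_2) \cdots p(\theta_{n-1} \mid \theta_n)\, p(\theta_n)
\end{align}
```

The first line is the general chain rule. The second line drops every conditioning variable beyond the next layer, which is the truncation described in the text.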
This means that the expectation in ( ), and thus the whole cost function C in ( ), also become sums of simple terms.
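The following minimal sketch illustrates why a layered factorization yields a sum of simple terms, assuming the cost function involves the logarithm of the factorized density, so that the product becomes a sum of per-layer logs. The two-layer Gaussian model and the names `log_normal_pdf`, `log_joint_terms`, and `log_joint` are hypothetical, chosen only for illustration.

```python
import math


def log_normal_pdf(value, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2) at `value`."""
    z = (value - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def log_joint_terms(x, theta1, theta2):
    """Per-layer log terms of a hypothetical two-layer hierarchy.

    Assumed model (illustration only):
        theta2 ~ N(0, 1)            top layer
        theta1 ~ N(theta2, 1)       first layer depends only on the second
        x      ~ N(theta1, 1)       data depends only on the first layer
    """
    return {
        "data | layer 1": log_normal_pdf(x, theta1, 1.0),
        "layer 1 | layer 2": log_normal_pdf(theta1, theta2, 1.0),
        "layer 2 prior": log_normal_pdf(theta2, 0.0, 1.0),
    }


def log_joint(x, theta1, theta2):
    """Log of the joint density, computed as a sum of the simple terms."""
    return sum(log_joint_terms(x, theta1, theta2).values())


# Example call: each entry is one simple term, and the log joint is their sum.
terms = log_joint_terms(0.5, 0.2, -0.1)
total = log_joint(0.5, 0.2, -0.1)
```

Because an expectation is linear, the expectation of such a sum is the sum of the expectations of the individual terms, so each layer can be handled separately.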