
Hierarchical priors

It is often desirable that the priors of the parameters are not too restrictive. A common type of vague prior is the hierarchical prior (Gelman95). For example, the priors of the elements $ a_{ij}$ of a mixing matrix $ {\mathbf{A}}$ can be defined via the Gaussian distributions

$\displaystyle p( a_{ij} \mid v^a_i ) = \mathcal N( a_{ij} ; 0, \exp(-v^a_i) )$ (49)
$\displaystyle p( v^a_i \mid m^{va}, v^{va} ) = \mathcal N( v^a_i ; m^{va}, \exp(-v^{va}) ) \, .$ (50)

Finally, the quantities $ m^{va}$ and $ v^{va}$ are given flat Gaussian priors $ \mathcal N(\cdot ; 0,100)$ (the constants depend on the scale of the data). Going up in the hierarchy, we use the same distribution for each column of a matrix and for each component of a vector. At the top of the hierarchy, only a small number of constant priors is required, so very little prior information needs to be provided. Hierarchical priors of this kind are used in the experiments later in this paper.
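
The hierarchy in (49) and (50) can be illustrated by ancestral sampling, drawing first the top-level quantities, then $ v^a_i$, then $ a_{ij}$. The following is only a minimal sketch under stated assumptions: the second argument of $ \mathcal N$ is taken to be a variance, one $ v^a_i$ is shared by all elements with the same first index $ i$ as written in (49), and the function name `sample_hierarchical_prior` and its parameters are hypothetical, not part of the original work.

```python
import numpy as np


def sample_hierarchical_prior(n_rows, n_cols, top_variance=100.0, seed=0):
    """Draw one sample of (m_va, v_va, v_a, A) from the hierarchical prior.

    Assumption: N(x; m, s) is parameterised by mean m and variance s,
    so exp(-v) acts as a variance and exp(-v / 2) as a standard deviation.
    """
    rng = np.random.default_rng(seed)
    top_std = np.sqrt(top_variance)

    # Top level: flat Gaussian priors N(. ; 0, top_variance).
    m_va = rng.normal(0.0, top_std)
    v_va = rng.normal(0.0, top_std)

    # Equation (50): v_i ~ N(m_va, exp(-v_va)), one value per index i.
    v_a = rng.normal(m_va, np.exp(-0.5 * v_va), size=n_rows)

    # Equation (49): a_ij ~ N(0, exp(-v_i)); the standard deviation of
    # each v_i is broadcast over the second index j.
    std_a = np.exp(-0.5 * v_a)[:, None]
    A = rng.normal(0.0, std_a, size=(n_rows, n_cols))
    return m_va, v_va, v_a, A


if __name__ == "__main__":
    # A small top-level variance keeps the printed numbers readable.
    m_va, v_va, v_a, A = sample_hierarchical_prior(3, 4, top_variance=1.0)
    print("m_va =", m_va, "v_va =", v_va)
    print("v_a  =", v_a)
    print("A    =", A, sep="\n")
```

With the default `top_variance=100.0` the sampled scales of $ a_{ij}$ vary over many orders of magnitude and may even overflow in floating point, which gives a sense of how little these priors constrain the parameters.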


Tapani Raiko 2006-08-28