Given the measurements, the unknown variables of the model are the
source signals, the mixing matrix, the parameters of the noise and
source distributions, and the hyperparameters. The posterior $P$ is
thus a pdf over all these unknown variables. For notational simplicity,
we shall sometimes denote these $n$ variables by
$\theta_1, \ldots, \theta_n$.
In order to make the approximation of the posterior pdf
computationally tractable, we shall choose the ensemble Q to be a
Gaussian pdf with diagonal covariance. The ensemble has twice as many
parameters as there are unknown variables in the model because each
dimension of the posterior pdf is parametrised by a mean and variance
in the ensemble. A hat over a symbol denotes the mean and
a tilde the variance of the corresponding variable.
\begin{displaymath}
Q(\theta_1, \ldots, \theta_n) = \prod_{i=1}^n {\cal G}(\theta_i; \hat{\theta}_i, \tilde{\theta}_i)
\end{displaymath}
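As a concrete illustration of this parametrisation (a minimal sketch; the variable names, the toy dimensionality and the use of NumPy are our own choices, not part of the original model), the ensemble can be stored as two vectors of the same length as $\theta$, one of means and one of variances:

```python
import numpy as np

# Hypothetical illustration: a factorised Gaussian ensemble over n variables.
# Each variable theta_i has a mean (hat) and a variance (tilde), so Q has
# twice as many parameters as the model has unknown variables.
n = 5
theta_hat = np.zeros(n)      # means \hat{theta}_i
theta_tilde = np.ones(n)     # variances \tilde{theta}_i

def log_q(theta, theta_hat, theta_tilde):
    """Log-density of the diagonal Gaussian Q = prod_i G(theta_i; hat_i, tilde_i)."""
    return np.sum(-0.5 * (theta - theta_hat) ** 2 / theta_tilde
                  - 0.5 * np.log(2 * np.pi * theta_tilde))
```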
The factorised ensemble makes the computation of the Kullback-Leibler
information $C_{\mathrm{KL}}(Q \,\|\, P) = E_Q\{\ln(Q/P)\}$ simple, since the
logarithm can be split into a sum of terms: the terms $E_Q\{\ln Q_i\}$
(the negative entropies of the Gaussian factors of $Q$) and the terms
$-E_Q\{\ln P_i\}$, where the $P_i$ are the factors of the posterior pdf.
Notice that the posterior pdf factorises into simple terms due to the
hierarchical structure of the model; the posterior pdf equals
the joint pdf divided by a normalising term.
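As a numerical sanity check of this decomposition (a sketch assuming, for simplicity, that every factor $P_i$ is itself a one-dimensional Gaussian, which need not hold for all factors of the model), the Kullback-Leibler information between two diagonal Gaussians accumulates term by term:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
q_hat, q_tilde = rng.normal(size=n), rng.uniform(0.5, 2.0, size=n)
p_hat, p_tilde = rng.normal(size=n), rng.uniform(0.5, 2.0, size=n)

# Closed-form KL between one-dimensional Gaussians, one term per dimension:
# E_Q{ln Q_i} - E_Q{ln P_i}.
kl_terms = 0.5 * (np.log(p_tilde / q_tilde)
                  + (q_tilde + (q_hat - p_hat) ** 2) / p_tilde - 1.0)
print(kl_terms.sum())          # total C_KL(Q || P)

# Monte Carlo estimate of E_Q{ln Q - ln P} for comparison.
theta = q_hat + np.sqrt(q_tilde) * rng.normal(size=(200_000, n))
lq = np.sum(-0.5 * (theta - q_hat) ** 2 / q_tilde
            - 0.5 * np.log(2 * np.pi * q_tilde), axis=1)
lp = np.sum(-0.5 * (theta - p_hat) ** 2 / p_tilde
            - 0.5 * np.log(2 * np.pi * p_tilde), axis=1)
print(np.mean(lq - lp))        # should agree with the closed form
```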
To see how to compute the terms $-E_Q\{\ln P_i\}$, let
$P_i = {\cal G}(\gamma_{ij}; \Gamma, e^{2\epsilon})$, that is, a Gaussian
whose variance is parametrised through $e^{2\epsilon}$. Then
\begin{displaymath}
-E_Q\{\ln P_i\} = E_Q\left\{ \frac{(\gamma_{ij} - \Gamma)^2 e^{-2\epsilon}
+ \ln 2\pi}{2} + \epsilon \right\}
\qquad (1)
\end{displaymath}
It is easy to show that this equals
\begin{displaymath}
\frac{\left[(\hat{\gamma}_{ij} - \hat{\Gamma})^2 + \tilde{\gamma}_{ij}
+ \tilde{\Gamma}\right] e^{2(\tilde{\epsilon}-\hat{\epsilon})} + \ln 2\pi}{2}
+ \hat{\epsilon},
\end{displaymath}
since, according to the choice of $Q$, the parameters $\gamma_{ij}$,
$\Gamma$ and $\epsilon$ have independent Gaussian distributions; in
particular, independence gives
$E_Q\{(\gamma_{ij} - \Gamma)^2\} = (\hat{\gamma}_{ij} - \hat{\Gamma})^2
+ \tilde{\gamma}_{ij} + \tilde{\Gamma}$, and for a Gaussian $\epsilon$,
$E_Q\{e^{-2\epsilon}\} = e^{2(\tilde{\epsilon}-\hat{\epsilon})}$.
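The closed form is easy to verify numerically. The following sketch (with arbitrary illustrative values for the means and variances, chosen by us) samples $\gamma_{ij}$, $\Gamma$ and $\epsilon$ independently from their Gaussian factors of $Q$ and averages the integrand of Eq. (1):

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary illustrative values for the means (hat) and variances (tilde).
g_hat, g_tilde = 0.3, 0.4      # gamma_ij
G_hat, G_tilde = -0.2, 0.6     # Gamma
e_hat, e_tilde = 0.1, 0.05     # epsilon

# Monte Carlo estimate of E_Q{ ((gamma - Gamma)^2 e^{-2 eps} + ln 2pi)/2 + eps }.
N = 1_000_000
g = g_hat + np.sqrt(g_tilde) * rng.normal(size=N)
G = G_hat + np.sqrt(G_tilde) * rng.normal(size=N)
e = e_hat + np.sqrt(e_tilde) * rng.normal(size=N)
mc = np.mean(((g - G) ** 2 * np.exp(-2 * e) + np.log(2 * np.pi)) / 2 + e)

# Closed form: E{(g - G)^2} = (g_hat - G_hat)^2 + g_tilde + G_tilde,
# and E{e^{-2 eps}} = e^{2(e_tilde - e_hat)} for Gaussian eps.
closed = (((g_hat - G_hat) ** 2 + g_tilde + G_tilde)
          * np.exp(2 * (e_tilde - e_hat)) + np.log(2 * np.pi)) / 2 + e_hat
print(mc, closed)   # the two values should agree to Monte Carlo accuracy
```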
The most difficult terms are the expectations of the remaining factors,
which do not reduce to the Gaussian form above. An approximation for
expectations of this form is given in appendix A.
The normalising term in the posterior pdf depends only on the
variables which are given, in this case the measurements, and can
therefore be neglected when minimising the Kullback-Leibler information
$C_{\mathrm{KL}}(Q \,\|\, P)$.