A simpler case is when the maximum likelihood estimate is used for some of the parameters. This can be done with the EM-algorithm, where the computation alternates between computing the posterior distribution of one set of variables given the current point estimate of the other set (the E-step), and then using that posterior distribution to compute a new maximum likelihood estimate of the second set (the M-step).

When the EM-algorithm is applied to ICA, usually the full posterior
distribution is computed for the sources and the maximum likelihood
estimate is used for the rest of the parameters. This means that in
the E-step we need to compute the posterior distribution of the
sources **s** given **x**, **A** and the noise covariance $\sigma^2 \mathbf{I}$,

$$p(\mathbf{s} \mid \mathbf{x}) \propto \exp\left( -\frac{1}{2\sigma^2} \left\| \mathbf{x} - \mathbf{A}\mathbf{s} \right\|^2 \right) p(\mathbf{s}) \, ,$$

and use it to update our estimates.
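As a concrete, simplified illustration of the E-step, assume for a moment a Gaussian source prior $\mathbf{s} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ and isotropic noise covariance $\sigma^2 \mathbf{I}$; the posterior is then Gaussian with closed-form moments. This is a deliberate simplification (for the non-Gaussian priors used in ICA the posterior is not tractable, which is why approximations such as the low-noise one of [2] are needed), and all dimensions and values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: m mixtures, n sources (not taken from the text).
m_dim, n_dim = 4, 3
A = rng.standard_normal((m_dim, n_dim))   # current mixing matrix estimate
sigma2 = 0.1                              # noise covariance sigma^2 * I
x = rng.standard_normal(m_dim)            # one observed mixture vector

# With a Gaussian prior s ~ N(0, I), the E-step posterior p(s | x) is
# Gaussian: covariance (A^T A / sigma^2 + I)^{-1}, mean Cov A^T x / sigma^2.
post_cov = np.linalg.inv(A.T @ A / sigma2 + np.eye(n_dim))
post_mean = post_cov @ A.T @ x / sigma2
```

The posterior mean and second moment computed this way are exactly the quantities the M-step consumes.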

Using the matrix notation for a finite number $T$ of samples, i.e. **X** and **S**, we can write the M-step (see [9]) re-estimation for the mixing matrix as

$$\mathbf{A}^{\mathrm{new}} = \mathbf{R}_{xs} \mathbf{R}_{ss}^{-1} \, ,$$

where the posterior correlation matrices are

$$\mathbf{R}_{xs} = \frac{1}{T} \mathbf{X} \langle \mathbf{S} \rangle^T \, , \qquad \mathbf{R}_{ss} = \frac{1}{T} \langle \mathbf{S} \mathbf{S}^T \rangle \, .$$

The expectations $\langle \cdot \rangle$ are taken over the posterior distribution of the sources.
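Given the posterior moments from the E-step, the M-step amounts to a couple of matrix products. A minimal sketch, in which `S_mean` and `SS_sum` stand in for E-step output (placeholder values here, not the result of a real E-step):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: m mixtures, n sources, T samples.
m, n, T = 4, 3, 1000
X = rng.standard_normal((m, T))                   # observed mixtures
S_mean = rng.standard_normal((n, T))              # placeholder posterior means <S>
SS_sum = S_mean @ S_mean.T + T * 0.1 * np.eye(n)  # placeholder sum_t <s_t s_t^T>

# Posterior correlation matrices, normalized by the number of samples T.
R_xs = X @ S_mean.T / T
R_ss = SS_sum / T

# M-step re-estimate of the mixing matrix: A_new = R_xs R_ss^{-1},
# computed via a linear solve rather than an explicit inverse.
A_new = np.linalg.solve(R_ss.T, R_xs.T).T
```

Solving the linear system instead of forming `inv(R_ss)` is the usual numerically safer choice.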

We will consider here the case where the noise variance $\sigma^2$ is small. If we further assume that the mixtures are prewhitened, we can constrain the mixing matrix to be orthogonal and assume that the sources have unit variance. This makes **R**_{ss} a unit matrix.
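The effect of prewhitening can be sketched as follows. This is a minimal illustration with hypothetical unit-variance Laplace sources and a random mixing matrix (neither taken from the text); `V` is the whitening matrix $\mathbf{C}^{-1/2}$ built from the sample covariance of the mixtures:

```python
import numpy as np

rng = np.random.default_rng(1)

# Three unit-variance Laplace sources, mixed by a random square matrix
# (illustrative choices, not from the text).
S = rng.laplace(size=(3, 2000)) / np.sqrt(2.0)
A = rng.standard_normal((3, 3))
X = A @ S

# Prewhitening: transform X so its sample covariance becomes the identity.
C = X @ X.T / X.shape[1]
d, E = np.linalg.eigh(C)
V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T   # whitening matrix C^{-1/2}
Z = V @ X

# The whitened data have identity covariance, so with unit-variance sources
# the effective mixing matrix V A is close to orthogonal and R_ss can be
# constrained to the unit matrix.
print(np.round(Z @ Z.T / Z.shape[1], 3))
```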

In [2] the EM-algorithm is derived as a low-noise approximation for the case of a square mixing matrix **A**. First, the posterior mean is obtained as

$$\langle \mathbf{s} \rangle \approx \hat{\mathbf{s}} - \sigma^2 (\mathbf{A}^T \mathbf{A})^{-1} \boldsymbol{\varphi}(\hat{\mathbf{s}}) \, ,$$

where $\boldsymbol{\varphi}(\mathbf{s}) = -\partial \log p(\mathbf{s}) / \partial \mathbf{s}$ is the derivative of the negative log-prior, $\hat{\mathbf{s}} = \mathbf{A}^{-1} \mathbf{x}$, and

$$\langle \mathbf{s} \mathbf{s}^T \rangle \approx \langle \mathbf{s} \rangle \langle \mathbf{s} \rangle^T + \sigma^2 (\mathbf{A}^T \mathbf{A})^{-1} \, .$$

Substituting the above approximations we get

$$\mathbf{A}^{\mathrm{new}} \approx \mathbf{A} \left[ \mathbf{I} + \sigma^2 (\mathbf{A}^T \mathbf{A})^{-1} \left( \langle \boldsymbol{\varphi}(\hat{\mathbf{s}}) \hat{\mathbf{s}}^T \rangle - \mathbf{I} \right) \right] \, .$$
As the authors mention in [2], this approximation leads to an EM-algorithm which converges slowly when the noise variance $\sigma^2$ is low, since the size of the update step is proportional to $\sigma^2$. They also point out that there is no visible "noise-correction". It is precisely this point that we will address in the next section.