Several algorithms for independent component analysis have been proposed in the literature [61,17,95,51,3,52]. Most of them take the signal-transformation approach: a linear transformation of the observations is sought whose outputs are as statistically independent as possible. These algorithms can be seen as extensions of PCA. Simplifying assumptions can yield very efficient algorithms, such as FastICA [55,51]. Some of the algorithms use the natural gradient, which was discussed in section 4.5.4.
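As an illustration of the signal-transformation approach, a minimal one-unit FastICA fixed-point iteration (whitening followed by the update **w** ← E[**z** g(**w**ᵀ**z**)] − E[g′(**w**ᵀ**z**)]**w** with g = tanh) might look as follows. The source distributions, mixing matrix, and sample size below are arbitrary choices for the sketch, not taken from the cited works:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Two non-Gaussian sources: a uniform signal and a Laplacian signal
s = np.vstack([rng.uniform(-1, 1, n), rng.laplace(0, 1, n)])
A = np.array([[2.0, 1.0], [1.0, 1.5]])   # hypothetical mixing matrix
x = A @ s                                 # observed mixtures

# Whiten: zero mean, identity covariance
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(x))
z = E @ np.diag(d ** -0.5) @ E.T @ x

# One-unit fixed-point iteration with g = tanh, g' = 1 - tanh^2
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(200):
    wz = w @ z
    w_new = (z * np.tanh(wz)).mean(axis=1) - (1 - np.tanh(wz) ** 2).mean() * w
    w_new /= np.linalg.norm(w_new)
    converged = abs(abs(w_new @ w) - 1) < 1e-10
    w = w_new
    if converged:
        break

y = w @ z   # estimated source, recovered up to sign and scale
```

The fixed point is reached in a handful of iterations; the projection `y` should be strongly correlated with one of the true sources (up to sign), which is the usual way to check a separation result.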

MAP estimation does not work for the IFA model unless the linear mapping **A** is suitably restricted. This is because the width of the peak of the posterior probability density of the factors, and thus the posterior probability mass, depends on the matrix **A**. It is, however, possible to take the width of the posterior into account. If the variance of the noise **n**(*t*) is assumed to be the same for all observations, the posterior probability mass is proportional to the posterior density and inversely proportional to the determinant |**A**|. This correction has been used in [99], although the method is given a different interpretation there.
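The scaling problem can be seen already in a one-dimensional toy model *x* = *a s* + *n*, with a unit-variance Gaussian prior on *s* and noise variance σ²: the posterior of *s* is Gaussian with variance 1/(*a*²/σ² + 1), so its peak density grows roughly linearly in |*a*|, whereas dividing by |*a*| keeps it bounded. A numerical sketch of this (all numbers are arbitrary illustration values):

```python
import math

sigma2 = 0.01  # noise variance, arbitrary small value

def peak_posterior_density(a):
    # Posterior variance of s given x for the model x = a*s + n,
    # with prior s ~ N(0, 1) and noise n ~ N(0, sigma2)
    v = 1.0 / (a * a / sigma2 + 1.0)
    # Peak value of a Gaussian density with variance v
    return 1.0 / math.sqrt(2.0 * math.pi * v)

for a in (1.0, 10.0, 100.0):
    dens = peak_posterior_density(a)
    print(f"a={a:6.1f}  peak density={dens:10.2f}  density/|a|={dens/a:6.3f}")
```

The peak density scales with |*a*| while density/|*a*| settles near 1/√(2πσ²): this is the one-dimensional counterpart of the |**A**| correction to the posterior density.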

In most algorithms, a point estimate is used for the linear mapping **A**. Hyvärinen *et al.* have shown that many of these algorithms suffer from overfitting [56]. Publication II describes how ensemble learning can be applied to IFA. Since point estimates are not used for any of the parameters, the algorithm is not prone to overfitting. In publication VII, a new interpretation of the FastICA algorithm is given which allows the same idea to be used with ensemble learning, thus yielding a Bayesian version of the FastICA algorithm. The treatment of the posterior distributions of the factors is improved over publication II by utilising the method applied in [3].
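Schematically, ensemble learning fits a tractable approximating distribution *q* over all unknowns at once (the notation below is generic, not taken from the publications): writing **θ** for the parameters, including **A**, and *S* for the factors, the cost being minimised is

$$
C(q) = \int q(\boldsymbol{\theta}, S)\,
\ln\frac{q(\boldsymbol{\theta}, S)}{p(X, \boldsymbol{\theta}, S)}\,
d\boldsymbol{\theta}\, dS
= D\bigl(q(\boldsymbol{\theta}, S)\,\|\,p(\boldsymbol{\theta}, S \mid X)\bigr)
- \ln p(X).
$$

Minimising *C* thus drives *q* towards the exact posterior while lower-bounding the evidence −ln *p(X)*; because *q* carries an approximate posterior distribution rather than a point estimate, the width of the posterior, whose neglect breaks plain MAP estimation above, is accounted for automatically.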