Several algorithms for independent component analysis have been proposed in the literature [61,17,95,51,3,52]. Many of them take the signal-transformation approach, with the criterion that the resulting sources should be as statistically independent as possible; such algorithms can be seen as extensions of PCA. Simplifying assumptions can yield very efficient algorithms, such as FastICA [55,51]. Some algorithms use the natural gradient, which was discussed in section 4.5.4.
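The signal-transformation idea can be illustrated with a minimal NumPy sketch of the FastICA fixed-point iteration, using the tanh nonlinearity and deflationary orthogonalisation. The toy sources, mixing matrix and iteration counts below are illustrative assumptions, not the setup of any cited work:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
t = np.linspace(0.0, 8.0, n)
# Two independent, non-Gaussian toy sources (assumed for illustration).
s1 = np.sign(np.sin(3.0 * t))      # sub-Gaussian square wave
s2 = rng.laplace(size=n)           # super-Gaussian noise
S = np.vstack([s1, s2])
A = np.array([[1.0, 0.5],          # assumed mixing matrix
              [0.5, 1.0]])
X = A @ S                          # observed mixtures

# Centre and whiten the mixtures.
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / n)
Z = E @ np.diag(d ** -0.5) @ E.T @ X

# FastICA fixed-point iteration, g(u) = tanh(u), deflation scheme.
W = np.zeros((2, 2))
for i in range(2):
    w = rng.normal(size=2)
    w /= np.linalg.norm(w)
    for _ in range(200):
        wx = w @ Z
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
        # Decorrelate from components already found.
        w_new -= W[:i].T @ (W[:i] @ w_new)
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < 1e-9
        w = w_new
        if converged:
            break
    W[i] = w

Y = W @ Z   # estimated sources, up to permutation, sign and scale
```

On this noiseless two-source mixture the estimated components match the true sources up to permutation, sign and scale, which is the inherent indeterminacy of ICA.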
MAP estimation does not work for the IFA model unless the linear mapping A is suitably restricted. This is because the width of the peak of the posterior probability density of the factors, and hence the posterior probability mass, depends on the matrix A. It is, however, possible to take the width of the posterior into account: if the variance of the noise n(t) is assumed to be the same for all observations, the posterior volume is proportional to the posterior density and inversely proportional to the determinant |A|. This has been used in , although the method is given a different interpretation.
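The degeneracy can be seen already in a scalar sketch, assuming a Gaussian prior s ~ N(0,1) and Gaussian noise of variance σ² (the concrete values below are assumed toy choices): as |A| grows, the posterior of the factor narrows as 1/|A|, so its peak density grows without bound even though the total posterior mass stays fixed. Comparing raw peak densities across different A is therefore misleading; multiplying the density by the posterior width (the volume factor) restores comparability.

```python
import numpy as np

sigma2 = 0.1                  # assumed noise variance
peaks, masses = [], []
for A in [1.0, 10.0, 100.0]:
    # Posterior of s given x for the conjugate scalar model x = A*s + n.
    post_var = 1.0 / (A ** 2 / sigma2 + 1.0)       # width shrinks ~ 1/A^2
    peak = 1.0 / np.sqrt(2.0 * np.pi * post_var)   # peak density grows ~ |A|
    peaks.append(peak)
    # Peak density times Gaussian width: the mass scale stays constant.
    masses.append(peak * np.sqrt(2.0 * np.pi * post_var))
```

Here `peaks` grows roughly linearly in A while every entry of `masses` equals 1, which is the scalar analogue of the density-versus-|A| correction described above.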
In most algorithms, a point estimate is used for the linear mapping A. Hyvärinen et al. have shown that many of these algorithms suffer from overfitting . Publication II describes how ensemble learning can be applied to IFA. Since point estimates are not used for any of the parameters, the algorithm is not prone to overfitting. Publication VII gives a new interpretation of the FastICA algorithm which allows the same idea to be used with ensemble learning, thus yielding a Bayesian version of FastICA. The treatment of the posterior distributions of the factors is improved over Publication II by utilising the method applied in .
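The derivations of the publications are not reproduced here, but the core of the ensemble-learning idea can be sketched generically: a Gaussian approximation q(s) = N(m, v) is fitted by maximising the free energy E_q[log p(x, s)] + H(q), whose entropy term rewards posterior width, so the approximation cannot collapse to a point the way a MAP estimate can. A minimal sketch on the same assumed scalar Gaussian toy model (values chosen for illustration):

```python
import numpy as np

sigma2, A, x = 0.1, 2.0, 1.0   # assumed toy noise variance, mapping and datum

def free_energy(m, v):
    """Free energy of q(s) = N(m, v) for the model x = A*s + n, s ~ N(0,1)."""
    e_lik = -0.5 * np.log(2.0 * np.pi * sigma2) \
            - ((x - A * m) ** 2 + A ** 2 * v) / (2.0 * sigma2)
    e_prior = -0.5 * np.log(2.0 * np.pi) - (m ** 2 + v) / 2.0
    entropy = 0.5 * np.log(2.0 * np.pi * np.e * v)   # rewards posterior width
    return e_lik + e_prior + entropy

# Exact posterior of s for this conjugate model, for comparison.
post_var = 1.0 / (A ** 2 / sigma2 + 1.0)
post_mean = post_var * A * x / sigma2

# Grid search over the approximating variance: the optimum recovers post_var.
vs = np.linspace(0.001, 0.1, 1000)
best_v = vs[np.argmax([free_energy(post_mean, v) for v in vs])]
```

Because the model is conjugate, the free energy is maximised exactly at the true posterior, illustrating why the width of the posterior is retained rather than discarded as in point estimation.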