We studied a number of different methods for PCA with sparse data, and a simple gradient descent approach turned out to work best due to its minimal computational complexity per iteration. We could speed it up further, by more than a factor of ten, using an approximated Newton's method. We found empirically that the chosen setting of the parameter seems to work well for our problem; whether this generalizes to other problem settings is left for future work. There are also many other ways to speed up the gradient descent algorithm. The natural gradient did not help here, but we expect that the conjugate gradient method would. The modification to the gradient proposed in this paper could be used together with the conjugate gradient speed-up; this is another topic for future research.
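As a minimal sketch of the basic approach, the squared reconstruction error over the observed entries can be minimized by plain gradient descent on the loading matrix A and the component matrix S. The function name, learning rate, and initialization below are illustrative assumptions, not the paper's exact algorithm:

import numpy as np

def pca_missing_gd(X, mask, c, lr=1e-3, n_iter=1000, seed=0):
    """Fit c-component PCA to data with missing values by gradient descent.

    X    : (d, n) data matrix; entries where mask == 0 are ignored
    mask : (d, n) binary array, 1 where X is observed
    Minimizes the squared error over the observed entries only.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    A = 0.1 * rng.standard_normal((d, c))   # loading matrix
    S = 0.1 * rng.standard_normal((c, n))   # principal components
    for _ in range(n_iter):
        E = mask * (X - A @ S)   # residual, nonzero only at observed entries
        gA = E @ S.T             # gradient direction for A
        gS = A.T @ E             # gradient direction for S
        A += lr * gA
        S += lr * gS
    return A, S

This dense form computes the residual over the full matrix; for large sparse data the gradients would instead be accumulated over the observed entries only, which is what makes the per-iteration cost proportional to the number of measured values (a sparse variant is sketched further below).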
There are also other benefits to solving the PCA problem by gradient descent. Algorithms that minimize an explicit cost function are relatively easy to extend. The case of variational Bayesian learning applied to PCA was considered in Section 4, but there are many other extensions of PCA, such as those using non-Gaussianity, non-linearity, mixture models, and dynamics.
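Concretely, the explicit cost in the basic case is the squared reconstruction error over the observed entries; the notation below is ours, chosen for illustration:

\[
C(A, S) = \sum_{(i,j) \in O} \Bigl( x_{ij} - \sum_{k=1}^{c} a_{ik} s_{kj} \Bigr)^{2},
\]

where O is the set of observed index pairs, A = (a_{ik}) is the loading matrix, and S = (s_{kj}) contains the principal components. Extensions typically augment or replace this cost, for instance with prior terms in variational Bayesian learning, while the gradient-based optimization machinery carries over.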
The developed algorithms can prove useful in many applications, such as bioinformatics, speech processing, and meteorology, in which large-scale datasets with missing values are very common. The required computational burden scales linearly with the number of measured values. Note also that the proposed techniques provide an analogue of confidence regions, indicating the reliability of the estimated quantities.
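To make the linear-scaling claim concrete, the gradients can be accumulated entry by entry over the observed values. The following sketch (names and conventions are ours, not taken from the paper) runs in time linear in the number of measured values:

import numpy as np

def sparse_gradients(rows, cols, vals, A, S):
    """Ascent directions for A and S using only the observed entries.

    rows, cols, vals : arrays of length |O| listing the observations
    x[rows[t], cols[t]] = vals[t].  Runtime is O(|O| * c) for c
    components, i.e. linear in the number of measured values.
    """
    # residuals at the observed positions only
    e = vals - np.einsum('tc,ct->t', A[rows], S[:, cols])
    gA = np.zeros_like(A)
    gS = np.zeros_like(S)
    # scatter-add each entry's contribution to its row of A ...
    np.add.at(gA, rows, e[:, None] * S[:, cols].T)
    # ... and to its column of S (gS.T is a writable view of gS)
    np.add.at(gS.T, cols, e[:, None] * A[rows])
    return gA, gS

Moving A and S a small step along gA and gS (as in A += lr * gA) decreases the squared error, matching the dense sketch above.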
Acknowledgments
This work was supported in part by the Academy of Finland under its Centers
for Excellence in Research Program, and the IST Program of the European
Community, under the PASCAL Network of Excellence, IST-2002-506778. This
publication only reflects the authors' views. We would like to thank Antti
Honkela for useful comments.