Bibliography

1
H. Abut, editor.
Vector Quantization.
IEEE Press, New York, 1990.

2
H. Attias.
ICA, graphical models and variational methods.
In S. Roberts and R. Everson, editors, Independent Component Analysis: Principles and Practice, pages 95-112. Cambridge University Press, 2001.

3
D. Barber and C. M. Bishop.
Ensemble learning for multi-layer networks.
In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in Neural Information Processing Systems 10, NIPS*97, pages 395-401, Denver, Colorado, USA, Dec. 1-6, 1997, 1998. The MIT Press.

4
E. Beale and C. Mallows.
Scale mixing of symmetric distributions with zero means.
Annals of Mathematical Statistics, 1959.

5
C. M. Bishop.
Neural Networks for Pattern Recognition.
Clarendon Press, Oxford, 1995.

6
C. M. Bishop, M. Svensén, and C. K. I. Williams.
GTM: The generative topographic mapping.
Neural Computation, 10(1):215-234, 1998.

7
C. M. Bishop and J. M. Winn.
Non-linear Bayesian image modelling.
In D. Vernon, editor, Proceedings of the 6th European Conference on Computer Vision, Part I, Lecture Notes in Computer Science, pages 3-17, Dublin, Ireland, 2000. Springer.

8
J.-F. Cardoso.
Multidimensional independent component analysis.
In Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP'98, pages 1941-1944, Seattle, Washington, USA, May 12-15, 1998.

9
C. Chen, editor.
Neural Networks For Pattern Recognition And Their Applications.
World Scientific, Singapore.

10
R. Choudrey, W. Penny, and S. Roberts.
An ensemble learning approach to independent component analysis.
In Proceedings of the IEEE workshop on Neural Networks for Signal Processing, Sydney, Australia, 2000.

11
R. T. Cox.
Probability, frequency and reasonable expectation.
American Journal of Physics, 14(1):1-13, 1946.

12
P. Dayan and R. S. Zemel.
Competition and multiple cause models.
Neural Computation, 7(3):565-579, 1995.

13
A. P. Dempster, N. M. Laird, and D. B. Rubin.
Maximum likelihood from incomplete data via the EM algorithm.
Journal of the Royal Statistical Society (Series B), 39:1-38, 1977.

14
P. Földiák and M. P. Young.
Sparse coding in the primate cortex.
In The Handbook of Brain Theory and Neural Networks, pages 895-898, Cambridge, Massachusetts, 1995. The MIT Press.

15
B. Frey and G. E. Hinton.
Variational learning in nonlinear Gaussian belief networks.
Neural Computation, 11(1):193-214, 1999.

16
B. Frey and N. Jojic.
Estimating mixture models of images and inferring spatial transformations using the EM algorithm.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 416-422, 1999.

17
K.-I. Funahashi.
On the approximate realization of continuous mappings by neural networks.
Neural Networks, 2(3):183-192, 1989.

18
A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin.
Bayesian Data Analysis.
Chapman & Hall, New York, 1995.

19
Z. Ghahramani and G. E. Hinton.
Hierarchical non-linear factor analysis and topographic maps.
In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in Neural Information Processing Systems 10, NIPS*97, pages 486-492, Denver, Colorado, USA, Dec. 1-6, 1997, 1998. The MIT Press.

20
Z. Ghahramani and S. T. Roweis.
Learning nonlinear dynamical systems using an EM algorithm.
In M. S. Kearns, S. A. Solla, and D. A. Cohn, editors, Advances in Neural Information Processing Systems 11, NIPS*98, pages 599-605, Denver, Colorado, USA, Nov. 30-Dec. 5, 1998, 1999. The MIT Press.

21
R. C. Gonzalez and R. E. Woods.
Digital Image Processing.
Addison-Wesley, 3rd edition, 1992.

22
H. Harman.
Modern Factor Analysis.
University of Chicago Press, 2nd edition, 1967.

23
T. Hastie and W. Stuetzle.
Principal curves.
Journal of the American Statistical Association, 84:502-516, 1989.

24
S. Haykin.
Neural Networks -- A Comprehensive Foundation.
Prentice Hall, 2nd edition, 1998.

25
G. E. Hinton and D. van Camp.
Keeping neural networks simple by minimizing the description length of the weights.
In Proceedings of the Sixth Annual ACM Conference on Computational Learning Theory, pages 5-13, Santa Cruz, California, USA, July 26-28, 1993.

26
K. Hornik, M. Stinchcombe, and H. White.
Multilayer feedforward networks are universal approximators.
Neural Networks, 2(5):359-366, 1989.

27
D. Hubel and T. Wiesel.
Receptive fields, binocular interaction and functional architecture in the cat's visual cortex.
Journal of Physiology (London), 160:106-154, 1962.

28
A. Hyvärinen.
Fast and robust fixed-point algorithms for independent component analysis.
IEEE Transactions on Neural Networks, 10(3):626-634, 1999.

29
A. Hyvärinen and P. O. Hoyer.
Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces.
Neural Computation, 12(7):1705-1720, 2000.

30
A. Hyvärinen and P. O. Hoyer.
Emergence of topography and complex cell properties from natural images using extensions of ICA.
In S. A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems 12, NIPS*99, pages 827-833, Denver, Colorado, USA, Nov. 29 - Dec. 4, 1999, 2000. The MIT Press.

31
A. Hyvärinen, J. Karhunen, and E. Oja.
Independent Component Analysis.
John Wiley & Sons, 2001.

32
I. T. Jolliffe.
Principal Component Analysis.
Springer-Verlag, 1986.

33
M. I. Jordan, editor.
Learning in Graphical Models.
The MIT Press, Cambridge, Massachusetts, 1999.

34
M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul.
An introduction to variational methods for graphical models.
In Jordan [33], pages 105-161.

35
C. Jutten and J. Herault.
Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture.
Signal Processing, 24:1-10, 1991.

36
J. Karhunen, S. Malaroiu, and M. Ilmoniemi.
Local linear independent component analysis using clustering.
International Journal of Neural Systems, 10(6):439-451, 2000.

37
R. E. Kass and L. Wasserman.
Formal rules for selecting prior distributions: A review and annotated bibliography.
Technical Report #583, Carnegie Mellon University, PA, 1994.

38
M. Kendall.
Multivariate Analysis.
Charles Griffin & Co., 1975.

39
T. Kohonen.
Self-Organizing Maps.
Springer-Verlag, Berlin, 3rd, extended edition, 2001.

40
T. Kohonen, S. Kaski, and H. Lappalainen.
Self-organized formation of various invariant-feature filters in the Adaptive-Subspace SOM.
Neural Computation, 9(6):1321-1344, 1997.

41
A. Krogh and J. A. Hertz.
A simple weight decay can improve generalization.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems, volume 4, pages 950-957, San Mateo, 1992. Morgan Kaufmann Publishers.

42
J. Lampinen and A. Vehtari.
Bayesian approach for neural networks -- review and case studies.
Neural Networks, 14(3):257-274, 2001.

43
H. Lappalainen and A. Honkela.
Bayesian nonlinear independent component analysis by multi-layer perceptrons.
In M. Girolami, editor, Advances in Independent Component Analysis, pages 93-121. Springer-Verlag, Berlin, 2000.

44
H. Lappalainen and J. W. Miskin.
Ensemble learning.
In M. Girolami, editor, Advances in Independent Component Analysis, pages 76-92. Springer-Verlag, Berlin, 2000.

45
T. Lee and M. Lewicki.
The generalized Gaussian mixture model using ICA.
In Proceedings of the 2nd International Workshop on Independent Component Analysis and Blind Source Separation (ICA2000), pages 239-244, Espoo, Finland, 2000.

46
T. Lee, M. Lewicki, M. Girolami, and T. Sejnowski.
Blind source separation of more sources than mixtures using overcomplete representations.
IEEE Signal Processing Letters, 6:87-90, 1999.

47
J. C. Lemm.
Prior information and generalized questions.
Technical Report A.I. Memo No. 1598, Massachusetts Institute of Technology, 1996.

48
D. J. C. MacKay.
Ensemble learning for hidden Markov models.
Available from http://wol.ra.phy.cam.ac.uk/, 1997.

49
J. Miskin and D. MacKay.
Ensemble learning for blind source separation.
In S. Roberts and R. Everson, editors, Independent Component Analysis: Principles and Practice, pages 209-233. Cambridge University Press, 2001.

50
K. P. Murphy.
A variational approximation for Bayesian networks with discrete and continuous latent variables.
In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), pages 457-466, 1999.

51
R. M. Neal.
Connectionist learning of belief networks.
Artificial Intelligence, 56(1):71-113, 1992.

52
R. M. Neal and G. E. Hinton.
A view of the EM algorithm that justifies incremental, sparse, and other variants.
In Jordan [33], pages 355-368.

53
E. Oja.
PCA, ICA and nonlinear Hebbian learning.
In Proceedings of the ICANN'95, pages 80-94, 1995.

54
E. Oja and J. Karhunen.
Signal separation by nonlinear Hebbian learning.
In Proceedings of the IEEE International Conference on Neural Networks, pages 83-87, 1995.

55
L. Parra, C. Spence, and P. Sajda.
Higher-order statistical properties arising from the non-stationarity of natural signals.
In Advances in Neural Information Processing Systems 13, Denver, 2000.

56
D.-T. Pham and J.-F. Cardoso.
Blind separation of instantaneous mixtures of non stationary sources.
In P. Pajunen and J. Karhunen, editors, Proceedings of the Second International Workshop on Independent Component Analysis and Blind Signal Separation, ICA 2000, pages 187-192, Helsinki, Finland, June 19-22, 2000.

57
T. Raiko and H. Valpola.
Missing values in nonlinear factor analysis.
In L. Zhang and F. Gu, editors, Proceedings of the 8th International Conference on Neural Information Processing, ICONIP 2001, volume 2, pages 822-827, Shanghai, China, 2001. Fudan University Press.

58
S. Roberts and R. Everson.
Introduction.
In S. Roberts and R. Everson, editors, Independent Component Analysis: Principles and Practice, pages 1-70. Cambridge University Press, 2001.

59
M. Schervish.
Theory of Statistics.
Springer, New York, 1995.

60
B. Schölkopf, A. Smola, and K. Müller.
Nonlinear component analysis as a kernel eigenvalue problem.
Neural Computation, 10:1299-1319, 1998.

61
H. Valpola.
Bayesian Ensemble Learning for Nonlinear Factor Analysis.
PhD thesis, Helsinki University of Technology, Espoo, Finland, 2000.
Published in Acta Polytechnica Scandinavica, Mathematics and Computing Series No. 108.

62
H. Valpola.
Nonlinear independent component analysis using ensemble learning: Theory.
In P. Pajunen and J. Karhunen, editors, Proceedings of the Second International Workshop on Independent Component Analysis and Blind Signal Separation, ICA 2000, pages 251-256, Helsinki, Finland, June 19-22, 2000.

63
H. Valpola.
Unsupervised learning of nonlinear dynamic state-space models.
Publications in Computer and Information Science A59, Helsinki University of Technology, Espoo, Finland, 2000.

64
H. Valpola, A. Honkela, and J. Karhunen.
Nonlinear static and dynamic blind source separation using ensemble learning.
In Proceedings of the International Joint Conference on Neural Networks (IJCNN'01), Washington D.C., USA, 2001.

65
H. Valpola and J. Karhunen.
An unsupervised ensemble learning method for nonlinear dynamic state-space models.
Neural Computation. In press.

66
H. Valpola, T. Raiko, and J. Karhunen.
Building blocks for hierarchical latent variable models.
In Proceedings of the 3rd International Conference on Independent Component Analysis and Blind Signal Separation, ICA 2001, San Diego, California, USA, December 9-12, 2001.
In press.

67
M. Wainwright and E. Simoncelli.
Scale mixtures of Gaussians and the statistics of natural images.
In Advances in Neural Information Processing Systems 12, Cambridge, 2000.

68
H. Wechsler, P. Phillips, V. Bruce, F. Soulie, and T. Huang, editors.
Face Recognition: From Theory to Applications.
Springer-Verlag, Berlin, 1998.



Tapani Raiko
2001-12-10