REFERENCES

1
S. Amari, Differential-Geometrical Methods in Statistics.
Springer-Verlag, 2nd ed., 1990.

2
S. Amari, ``Natural gradient works efficiently in learning,'' Neural Computation, vol. 10, no. 2, pp. 251-276, 1998.

3
H. Attias, ``Independent factor analysis,'' Neural Computation, vol. 11, no. 4, pp. 803-851, 1999.

4
H. B. Barlow, ``Cerebral cortex as model builder,'' in Models of the visual cortex (D. Rose and V. G. Dobson, eds.), pp. 37-46, John Wiley & Sons, 1985.

5
A. Basilevsky, Statistical Factor Analysis and Related Methods: Theory and Applications.
John Wiley & Sons, 1994.

6
R. A. Baxter and J. J. Oliver, ``MDL and MML: Similarities and differences,'' Tech. Rep. TR 207, Department of Computer Science, Monash University, Australia, 1994.

7
J. M. Bernardo and A. F. M. Smith, Bayesian Theory.
Wiley, 1994.

8
C. M. Bishop, Neural Networks for Pattern Recognition.
Clarendon Press, 1995.

9
C. M. Bishop, ``Bayesian PCA,'' in Advances in Neural Information Processing Systems 11, NIPS*98, (Denver, Colorado, USA, Nov. 30-Dec. 5, 1998), pp. 382-388, The MIT Press, 1999.

10
C. M. Bishop, M. Svensén, and C. K. I. Williams, ``GTM: The generative topographic mapping,'' Neural Computation, vol. 10, no. 1, pp. 215-234, 1998.

11
G. Boole, An Investigation of the Laws of Thought.
Walton and Maberley, 1854.

12
T. Briegel and V. Tresp, ``Fisher scoring and a mixture of modes approach for approximate inference and learning in nonlinear state space models,'' in Advances in Neural Information Processing Systems 11, NIPS*98, (Denver, Colorado, USA, Nov. 30-Dec. 5, 1998), pp. 403-409, The MIT Press, 1999.

13
G. Burel, ``Blind separation of sources: A nonlinear neural algorithm,'' Neural Networks, vol. 5, no. 6, pp. 937-947, 1992.

14
J.-F. Cardoso, ``Multidimensional independent component analysis,'' in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP'98, (Seattle, Washington, USA, May 12-15), pp. 1941-1944, 1998.

15
G. J. Chaitin, ``On the length of programs for computing finite binary sequences,'' Journal of the ACM, vol. 13, no. 4, pp. 547-569, 1966.

16
A. Cichocki, L. Zhang, S. Choi, and S. Amari, ``Nonlinear dynamic independent component analysis using state-space and neural network models,'' in Proceedings of the First International Workshop on Independent Component Analysis and Signal Separation, ICA'99, (Aussois, France, Jan. 11-15), pp. 99-104, 1999.

17
P. Comon, ``Independent component analysis -- a new concept?,'' Signal Processing, vol. 36, pp. 287-314, 1994.

18
T. M. Cover and J. A. Thomas, Elements of Information Theory.
John Wiley & Sons, 1991.

19
R. T. Cox, ``Probability, frequency and reasonable expectation,'' American Journal of Physics, vol. 14, no. 1, pp. 1-13, 1946.

20
G. Deco and W. Brauer, ``Nonlinear higher-order statistical decorrelation by volume-conserving neural architecture,'' Neural Networks, vol. 8, no. 4, pp. 525-535, 1995.

21
A. P. Dempster, N. M. Laird, and D. B. Rubin, ``Maximum likelihood from incomplete data via the EM algorithm,'' Journal of the Royal Statistical Society (Series B), vol. 39, pp. 1-38, 1977.

22
D. C. Dennett, Consciousness Explained.
Little, Brown and Co., 1991.

23
H. Dürer and T. Waschulzik, ``ESyNN -- a model to abstractly emulate synchronization in neural networks,'' in Proceedings of the Ninth International Conference on Artificial Neural Networks, ICANN'99, (Edinburgh, UK, Sep. 7-10), pp. 791-796, 1999.

24
R. Eckhorn, R. Bauer, W. Jordan, M. Brosch, W. Kruse, M. Munk, and H. J. Reitboeck, ``Coherent oscillations: A mechanism of feature linking in the visual cortex? Multiple electrode and correlation analyses in the cat,'' Biological Cybernetics, vol. 60, pp. 121-130, 1989.

25
B. Everitt, ed., An Introduction to Latent Variable Models.
Chapman and Hall, 1984.

26
D. J. Felleman and D. C. Van Essen, ``Distributed hierarchical processing in the primate cerebral cortex,'' Cerebral Cortex, vol. 1, no. 1, pp. 1-47, 1991.

27
W. T. Freeman, ``The generic viewpoint assumption in a Bayesian framework,'' in Knill and Richards [64], pp. 365-389, 1996.

28
A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data Analysis.
Chapman & Hall, 1995.

29
Z. Ghahramani and G. E. Hinton, ``Hierarchical non-linear factor analysis and topographic maps,'' in Advances in Neural Information Processing Systems 10, NIPS*97, (Denver, Colorado, USA, Dec. 1-6, 1997), pp. 486-492, The MIT Press, 1998.

30
Z. Ghahramani and G. E. Hinton, ``Variational learning for switching state-space models,'' Neural Computation, vol. 12, no. 4, pp. 963-996, 2000.

31
Z. Ghahramani and S. T. Roweis, ``Learning nonlinear dynamical systems using an EM algorithm,'' in Advances in Neural Information Processing Systems 11, NIPS*98, (Denver, Colorado, USA, Nov. 30-Dec. 5, 1998), pp. 599-605, The MIT Press, 1999.

32
C. D. Gilbert, ``Circuitry, architecture, and functional dynamics of visual cortex,'' Cerebral Cortex, vol. 3, no. 5, pp. 373-386, 1993.

33
W. R. Gilks, S. Richardson, and D. J. Spiegelhalter, eds., Markov Chain Monte Carlo in Practice.
Chapman & Hall, 1996.

34
M. Girolami, Self-Organising Neural Networks -- Independent Component Analysis and Blind Source Separation.
Springer-Verlag, 1999.

35
R. L. Gorsuch, Factor Analysis.
Lawrence Erlbaum Associates, 2nd ed., 1983.

36
C. M. Gray, ``Synchronous oscillations in neuronal systems, mechanisms and functions,'' Journal of Computational Neuroscience, vol. 1, pp. 11-39, 1994.

37
C. M. Gray and W. Singer, ``Stimulus-specific neuronal oscillations in orientation columns of cat visual cortex,'' Proc. Natl. Acad. Sci. USA, vol. 86, pp. 1698-1702, 1989.

38
M. S. Grewal and A. P. Andrews, Kalman Filtering.
Prentice-Hall, 1993.

39
S. Haykin, Neural Networks -- A Comprehensive Foundation.
Prentice Hall, 2nd ed., 1998.

40
R. Hecht-Nielsen, ``Replicator neural networks for universal optimal source coding,'' Science, vol. 269, pp. 1860-1863, 1995.

41
R. Herken, ed., The Universal Turing Machine: a Half-Century Survey.
Oxford University Press, 1988.

42
M. Herrmann and H. H. Yang, ``Perspectives and limitations of self-organising maps in blind separation of source signals,'' in Progress in Neural Information Processing, Proc. ICONIP'96, (Wan Chai, Hong Kong, Sep. 24-27), pp. 1211-1216, Springer-Verlag, 1996.

43
G. E. Hinton and T. J. Sejnowski, eds., Unsupervised Learning: Foundations of Neural Computation.
Computational Neuroscience Series, The MIT Press, 1999.

44
G. E. Hinton and D. van Camp, ``Keeping neural networks simple by minimizing the description length of the weights,'' in Proceedings of COLT'93, (Santa Cruz, California, USA, July 26-28), pp. 5-13, 1993.

45
S. Hochreiter and M. C. Mozer, ``An electric field approach to independent component analysis,'' in Proceedings of the Second International Workshop on Independent Component Analysis and Blind Signal Separation, ICA 2000, (Helsinki, Finland, June 19-22), pp. 45-50, 2000.

46
S. Hochreiter and J. Schmidhuber, ``Flat minima,'' Neural Computation, vol. 9, no. 1, pp. 1-42, 1997.

47
S. Hochreiter and J. Schmidhuber, ``Feature extraction through LOCOCODE,'' Neural Computation, vol. 11, no. 3, pp. 679-714, 1999.

48
S. Hochreiter and J. Schmidhuber, ``LOCOCODE performs nonlinear ICA without knowing the number of sources,'' in Proceedings of the First International Workshop on Independent Component Analysis and Signal Separation, ICA'99, (Aussois, France, Jan. 11-15), pp. 149-154, 1999.

49
K. Hornik, M. Stinchcombe, and H. White, ``Multilayer feedforward networks are universal approximators,'' Neural Networks, vol. 2, no. 5, pp. 359-366, 1989.

50
J.-M. Hupé, A. C. James, B. R. Payne, S. G. Lomber, P. Girard, and J. Bullier, ``Cortical feedback improves discrimination between figure and background by V1, V2 and V3 neurons,'' Nature, vol. 394, pp. 784-787, 1998.

51
A. Hyvärinen, ``Fast and robust fixed-point algorithms for independent component analysis,'' IEEE Transactions on Neural Networks, vol. 10, no. 3, pp. 626-634, 1999.

52
A. Hyvärinen, ``Survey on independent component analysis,'' Neural Computing Surveys, vol. 2, pp. 94-128, 1999.

53
A. Hyvärinen and P. O. Hoyer, ``Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces,'' Neural Computation, vol. 12, no. 7, pp. 1705-1720, 2000.

54
A. Hyvärinen and P. O. Hoyer, ``Emergence of topography and complex cell properties from natural images using extensions of ICA,'' in Advances in Neural Information Processing Systems 12, NIPS*99, (Denver, Colorado, USA, Nov. 29-Dec. 4, 1999), pp. 827-833, The MIT Press, 2000.

55
A. Hyvärinen and E. Oja, ``A fast fixed-point algorithm for independent component analysis,'' Neural Computation, vol. 9, no. 7, pp. 1483-1492, 1997.

56
A. Hyvärinen, J. Särelä, and R. Vigário, ``Bumps and spikes: Artifacts generated by independent component analysis with insufficient sample size,'' in Proceedings of the First International Workshop on Independent Component Analysis and Signal Separation, ICA'99, (Aussois, France, Jan. 11-15), pp. 425-429, 1999.

57
E. T. Jaynes, ``Probability theory: The logic of science.'' Available from http://bayes.wustl.edu/etj/prob.html, 1996.

58
I. T. Jolliffe, Principal Component Analysis.
Springer-Verlag, 1986.

59
M. I. Jordan, ed., Learning in Graphical Models.
The MIT Press, 1999.

60
M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul, ``An introduction to variational methods for graphical models,'' in Jordan [59], pp. 105-161, 1999.

61
C. Jutten and J. Herault, ``Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture,'' Signal Processing, vol. 24, pp. 1-10, 1991.

62
E. R. Kandel, J. H. Schwartz, and T. M. Jessell, eds., Principles of Neural Science.
Elsevier, 3rd ed., 1991.

63
M. Kendall, Multivariate Analysis.
Charles Griffin & Co., 1975.

64
D. C. Knill and W. Richards, eds., Perception as Bayesian Inference.
Cambridge University Press, 1996.

65
T. Kohonen, Self-Organizing Maps.
Springer-Verlag, 2nd, extended ed., 1997.

66
T. Kohonen, S. Kaski, and H. Lappalainen, ``Self-organized formation of various invariant-feature filters in the Adaptive-Subspace SOM,'' Neural Computation, vol. 9, no. 6, pp. 1321-1344, 1997.

67
A. N. Kolmogorov, ``Three approaches to the quantitative definition of information,'' Problems of Information Transmission, vol. 1, pp. 1-17, 1965.
Translated from Problemy Peredachi Informatsii (in Russian).

68
S. M. Kosslyn, W. L. Thompson, I. J. Kim, and N. M. Alpert, ``Topographical representations of mental images in primary visual cortex,'' Nature, vol. 378, pp. 496-498, 1995.

69
S. W. Kuffler, J. G. Nicholls, and A. R. Martin, From Neuron to Brain.
Sinauer Associates Inc. Publishers, 2nd ed., 1984.

70
S. Kullback and R. A. Leibler, ``On information and sufficiency,'' The Annals of Mathematical Statistics, vol. 22, pp. 79-86, 1951.

71
P. S. Laplace, ``Mémoire sur la probabilité des causes par les événements,'' Mémoires de l'Académie Royale des Sciences, vol. 6, pp. 621-656, 1774.
English translation in [123].

72
H. Lappalainen, ``Fast fixed-point algorithms for Bayesian blind source separation,'' Publications in Computer and Information Science A56, Helsinki University of Technology, Espoo, Finland, 1999.

73
S. Lauritzen, ed., Graphical Models.
Oxford University Press, 1996.

74
D. D. Lee and H. S. Seung, ``Unsupervised learning by convex and conic coding,'' in Advances in Neural Information Processing Systems 9, NIPS*96, (Denver, Colorado, USA, Nov. 2-5, 1996), pp. 515-521, The MIT Press, 1997.

75
P. M. Lee, Bayesian Statistics: An Introduction.
Oxford University Press, 1989.

76
T.-W. Lee, Independent Component Analysis -- Theory and Applications.
Kluwer, 1998.

77
L. A. Levin, ``Universal sequential search problems,'' Problems of Information Transmission, vol. 9, no. 3, pp. 256-266, 1973.

78
M. Li and P. M. B. Vitányi, An Introduction to Kolmogorov Complexity and its Applications.
Springer-Verlag, 2nd, extended ed., 1997.

79
J. K. Lin, D. Grier, and J. D. Cowan, ``Faithful representation of separable input distribution,'' Neural Computation, vol. 9, no. 6, pp. 1305-1320, 1997.

80
W. Maass and C. M. Bishop, eds., Pulsed Neural Networks.
The MIT Press, 1999.

81
D. J. C. MacKay, ``A practical Bayesian framework for backpropagation networks,'' Neural Computation, vol. 4, no. 3, pp. 448-472, 1992.

82
D. J. C. MacKay, ``Developments in probabilistic modelling with neural networks--ensemble learning,'' in Neural Networks: Artificial Intelligence and Industrial Applications. Proceedings of the 3rd Annual Symposium on Neural Networks, (Nijmegen, Netherlands, Sep. 14-15), pp. 191-198, Springer-Verlag, 1995.

83
D. J. C. MacKay, ``Ensemble learning for hidden Markov models.'' Available from http://wol.ra.phy.cam.ac.uk/, 1997.

84
D. J. C. MacKay, ``Choice of basis for Laplace approximation,'' Machine Learning, vol. 33, no. 1, pp. 77-86, 1998.

85
D. J. C. MacKay and M. N. Gibbs, ``Density networks,'' in Proceedings of Society for General Microbiology Edinburgh Meeting, 1997.

86
G. C. Marques and L. B. Almeida, ``An objective function for independence,'' in Proceedings of the International Conference on Neural Networks, ICNN'96, (Washington, DC, USA, June 3-6), pp. 453-457, 1996.

87
G. C. Marques and L. B. Almeida, ``Separation of nonlinear mixtures using pattern repulsion,'' in Proceedings of the First International Workshop on Independent Component Analysis and Signal Separation, ICA'99, (Aussois, France, Jan. 11-15), pp. 277-282, 1999.

88
P. S. Maybeck, Stochastic Models, Estimation, and Control, vol. 1.
Academic Press, 1979.

89
G. J. McLachlan and K. E. Basford, Mixture Models. Inference and Applications to Clustering.
Marcel Dekker, 1988.

90
J. Moody and C. Darken, ``Fast learning in networks of locally-tuned processing units,'' Neural Computation, vol. 1, no. 2, pp. 281-294, 1989.

91
R. M. Neal, ``Connectionist learning of belief networks,'' Artificial Intelligence, vol. 56, no. 1, pp. 71-113, 1992.

92
R. M. Neal, Bayesian Learning for Neural Networks.
No. 118 in Lecture Notes in Statistics, Springer-Verlag, 1996.

93
R. M. Neal and G. E. Hinton, ``A view of the EM algorithm that justifies incremental, sparse, and other variants,'' in Jordan [59], pp. 355-368, 1999.

94
J.-H. Oh and H. S. Seung, ``Learning generative models with the up-propagation algorithm,'' in Advances in Neural Information Processing Systems 10, NIPS*97, (Denver, Colorado, USA, Dec. 1-6, 1997), pp. 605-611, The MIT Press, 1998.

95
E. Oja, ``The nonlinear PCA learning rule in independent component analysis,'' Neurocomputing, vol. 17, no. 1, pp. 25-46, 1997.

96
J. J. Oliver and R. A. Baxter, ``MML and Bayesianism: Similarities and differences,'' Tech. Rep. TR 206, Department of Computer Science, Monash University, Australia, 1994.

97
J. J. Oliver and D. J. Hand, ``Introduction to minimum encoding inference,'' Tech. Rep. TR 205, Department of Computer Science, Monash University, Australia, 1994.

98
P. Pajunen, ``Nonlinear independent component analysis by self-organizing maps,'' in Proceedings of the Sixth International Conference on Artificial Neural Networks, ICANN'96, (Bochum, Germany, July 16-19), pp. 815-819, 1996.

99
P. Pajunen, ``Blind source separation using algorithmic information theory,'' Neurocomputing, vol. 22, pp. 35-48, 1998.

100
P. Pajunen and J. Karhunen, ``A maximum likelihood approach to nonlinear blind source separation,'' in Proceedings of the Seventh International Conference on Artificial Neural Networks, ICANN'97, (Lausanne, Switzerland, Oct. 8-10), pp. 541-546, 1997.

101
J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.
Morgan Kaufmann, 1988.

102
J. W. Pratt, H. Raiffa, and R. O. Schlaifer, Introduction to Statistical Decision Theory.
The MIT Press, 1995.

103
S. J. Press, Bayesian Statistics: Principles, Models, and Applications.
Wiley, 1989.

104
P. Rakic and W. Singer, eds., Neurobiology of Neocortex.
John Wiley & Sons, 1988.

105
R. P. N. Rao and D. H. Ballard, ``Kalman filter model of the visual cortex,'' Neural Computation, vol. 9, no. 4, pp. 721-763, 1997.

106
J. Rissanen, ``Modeling by shortest data description,'' Automatica, vol. 14, no. 5, pp. 465-471, 1978.

107
J. Rissanen, ``Fisher information and stochastic complexity,'' IEEE Transactions on Information Theory, vol. 42, no. 1, pp. 40-47, 1996.

108
J. Rissanen and G. G. Langdon, Jr., ``Arithmetic coding,'' IBM Journal of Research and Development, vol. 23, no. 2, pp. 149-162, 1979.

109
J. Rissanen and G. G. Langdon, Jr., ``Universal modeling and coding,'' IEEE Transactions on Information Theory, vol. 27, pp. 12-23, 1981.

110
D. Rubin and D. Thayer, ``EM algorithms for factor analysis,'' Psychometrika, vol. 47, pp. 69-76, 1982.

111
D. E. Rumelhart, G. E. Hinton, and R. J. Williams, ``Learning internal representations by error backpropagation,'' in Parallel distributed processing (D. E. Rumelhart and J. L. McClelland, eds.), vol. 1, pp. 318-362, The MIT Press, 1986.

112
S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach.
Prentice-Hall, 1995.

113
L. K. Saul, T. Jaakkola, and M. I. Jordan, ``Mean field theory for sigmoid belief networks,'' Journal of Artificial Intelligence Research, vol. 4, pp. 61-76, 1996.

114
L. J. Savage, The Foundations of Statistics.
Dover Publications, 1954.

115
M. J. Schervish, Theory of Statistics.
Springer-Verlag, 1995.

116
J. Schmidhuber, ``Discovering neural nets with low Kolmogorov complexity and high generalization capability,'' Neural Networks, vol. 10, no. 5, pp. 857-873, 1997.

117
C. E. Shannon, ``A mathematical theory of communication,'' Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656, 1948.

118
R. H. Shumway and D. S. Stoffer, ``An approach to time series smoothing and forecasting using the EM algorithm,'' Journal of Time Series Analysis, vol. 3, no. 4, pp. 253-264, 1982.

119
R. J. Solomonoff, ``A formal theory of inductive inference. Part I,'' Information and Control, vol. 7, no. 1, pp. 1-22, 1964.

120
R. J. Solomonoff, ``A formal theory of inductive inference. Part II,'' Information and Control, vol. 7, no. 2, pp. 224-254, 1964.

121
H. W. Sorenson, ed., Kalman Filtering: Theory and Application.
IEEE Press, 1985.

122
C. Spearman, ``"General intelligence," objectively determined and measured,'' American Journal of Psychology, vol. 15, pp. 201-293, 1904.

123
S. M. Stigler, ``Translation of Laplace's 1774 memoir on ``Probability of causes'','' Statistical Science, vol. 1, no. 3, pp. 359-378, 1986.

124
R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction.
The MIT Press, 1998.

125
A. Taleb and C. Jutten, ``Nonlinear source separation: The post-nonlinear mixtures,'' in Proceedings of the European Symposium on Artificial Neural Networks, ESANN'97, (Bruges, Belgium, Apr. 16-18), pp. 279-284, 1997.

126
A. Taleb and C. Jutten, ``Source separation in post-nonlinear mixtures,'' IEEE Transactions on Signal Processing, vol. 47, no. 10, pp. 2807-2820, 1999.

127
K. Tanaka, ``Inferotemporal cortex and object vision,'' Annual Review of Neuroscience, vol. 19, pp. 109-139, 1996.

128
H. Valpola, X. Giannakopoulos, A. Honkela, and J. Karhunen, ``Nonlinear independent component analysis using ensemble learning: Experiments and discussion,'' in Proceedings of the Second International Workshop on Independent Component Analysis and Blind Signal Separation, ICA 2000, (Helsinki, Finland, June 19-22), pp. 351-356, 2000.

129
A. Wald, Statistical Decision Functions.
Wiley, 1950.

130
C. S. Wallace and D. M. Boulton, ``An information measure for classification,'' Computer Journal, vol. 11, no. 2, pp. 185-194, 1968.

131
C. S. Wallace and P. R. Freeman, ``Estimation and inference by compact coding,'' Journal of the Royal Statistical Society (Series B), vol. 49, no. 3, pp. 240-265, 1987.

132
J. E. Whitesitt, Boolean Algebra and Its Applications.
Dover Publications, 1995.

133
R. R. Yager and L. A. Zadeh, An Introduction to Fuzzy Logic Applications in Intelligent Systems.
Kluwer Academic Publishers, 1992.

134
H. H. Yang, S. Amari, and A. Cichocki, ``Information back-propagation for blind separation of sources from non-linear mixtures,'' in Proceedings of the International Conference on Neural Networks, ICNN'97, (Houston, Texas, USA, June 9-12), 1997.

135
H. H. Yang, S. Amari, and A. Cichocki, ``Information-theoretic approach to blind separation of sources in non-linear mixture,'' Signal Processing, vol. 64, pp. 291-300, 1998.

136
L. Zadeh, ``Fuzzy sets,'' Information and Control, vol. 8, pp. 338-353, 1965.



Harri Valpola
2000-10-31