References
-
H. Attias
Hierarchical ICA belief networks.
In M. S. Kearns, S. A. Solla, and D. A. Cohn, editors,
NIPS 11, 1999. In press.
-
H. Attias
Independent factor analysis.
Neural Computation,
11(4):803-851, 1999.
[PostScript (674 kb)]
-
D. Barber and
C. M. Bishop.
Ensemble learning for multi-layer networks.
In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors,
NIPS 10, pages 395-401, 1998.
The MIT Press.
[Abstract and PostScript]
-
D. Barber and
B. Schottky.
Radial basis functions: a Bayesian treatment.
In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors,
NIPS 10, pages 402-408, 1998.
The MIT Press.
[Abstract and PostScript]
-
C. M. Bishop,
N. Lawrence,
T. Jaakkola and
M. I. Jordan.
Approximating posterior distributions in belief networks using
mixtures.
In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors,
NIPS 10, pages 416-422, 1998.
The MIT Press.
-
Z. Ghahramani and
G. E. Hinton.
Hierarchical non-linear factor analysis and topographic maps.
In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors,
NIPS 10, pages 486-492, 1998.
The MIT Press.
[PostScript]
-
G. E. Hinton and
D. van Camp.
Keeping neural networks simple by minimizing the description
length of the weights.
In Proceedings of the COLT'93, pages 5-13, Santa Cruz,
California, 1993.
[PostScript (744 kb)], index terms and [PDF]
-
G. E. Hinton and
Z. Ghahramani.
Generative Models for Discovering Sparse Distributed Representations.
Philosophical Transactions of the Royal Society B, 354:1177-1190.
[PostScript]
-
G. E. Hinton and
R. S. Zemel.
Autoencoders, minimum description length and Helmholtz free energy.
In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors,
NIPS 6, pages 3-10, San Francisco, 1994. Morgan Kaufmann.
-
S. Hochreiter and
J. Schmidhuber.
Flat minima.
Neural Computation, 9(1):1-42, January 1997.
[PDF]
-
S. Hochreiter and
J. Schmidhuber.
LOCOCODE performs nonlinear ICA without knowing the number of sources.
In Proceedings of the ICA'99, pages 149-154, Aussois, France,
1999.
-
H. Lappalainen.
Using an MDL-based cost function with neural networks.
In Proceedings of the IJCNN'98, pages 2384-2389, Anchorage,
Alaska, 1998.
[HTML], [PostScript (63 kb)]
-
H. Lappalainen.
Ensemble learning for independent component analysis.
In Proceedings of the ICA'99, pages 7-12, Aussois, France,
1999.
[HTML], [PostScript (90 kb)]
-
H. Lappalainen and
X. Giannakopoulos.
Multi-layer perceptrons as nonlinear generative models for unsupervised
learning: a Bayesian treatment.
In Proceedings of ICANN'99. Accepted.
[HTML], [PostScript (126 kb)]
-
D. J. C. MacKay.
Bayesian interpolation.
Neural Computation, 4:415-447, 1992.
-
D. J. C. MacKay.
A practical Bayesian framework for backpropagation networks.
Neural Computation, 4:448-472, 1992.
-
D. J. C. MacKay.
The evidence framework applied to classification networks.
Neural Computation, 4:698-714, 1992.
-
D. J. C. MacKay.
Probable networks and plausible predictions -
a review of practical Bayesian methods for supervised neural networks.
Network, 6(3):469-505, 1995.
-
D. J. C. MacKay.
Ensemble learning and evidence maximization.
[PostScript]
-
D. J. C. MacKay.
Developments in Probabilistic Modelling with Neural Networks -
Ensemble Learning.
In Neural Networks: Artificial Intelligence and Industrial
Applications. Proceedings of the 3rd Annual Symposium on Neural
Networks, Nijmegen, Netherlands, 14-15 September 1995, pages
191-198, Berlin, 1995. Springer.
[PostScript (45 kb)]
-
D. J. C. MacKay.
Ensemble learning for hidden Markov models.
Available from
http://wol.ra.phy.cam.ac.uk/, 1997.
[PostScript (33 kb)]
-
D. J. C. MacKay.
Comparison of approximate methods for handling hyperparameters.
Neural Computation.
Submitted.
[PostScript]
-
É. Moulines,
J.-F. Cardoso and
E. Gassiat.
Maximum likelihood for blind separation and deconvolution of noisy
signals using mixture models.
In Proceedings of the ICASSP'97, pages 3617-3620, Munich,
Germany, 1997.
[PostScript]
-
R. M. Neal.
Learning Stochastic Feedforward Networks.
Technical Report CRG-TR-90-7, Dept. of Computer Science,
University of Toronto.
-
J.-H. Oh and
H. S. Seung.
Learning generative models with the up-propagation algorithm.
In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors,
NIPS 10, pages 605-611, 1998.
The MIT Press.
[PostScript]
-
B. Pfahringer.
Compression-based feature subset selection.
In P. Turney, editor, IJCAI-95 Workshop on Data Engineering for
Inductive Learning. IJCAI-95 Workshop Program Working Notes, Montreal,
Canada, 1995.
-
J. Rissanen.
Modeling by shortest data description.
Automatica, 14:465-471, 1978.
-
J. Rissanen.
A universal prior for integers and estimation by minimum description
length.
Annals of Statistics, 11(2):416-431, 1983.
-
J. Rissanen.
Stochastic complexity.
Journal of the Royal Statistical Society (Series B),
49(3):223-239 and 252-265, 1987.
-
J. Rissanen.
Fisher information and stochastic complexity.
IEEE Transactions on Information Theory, 42(1):40-47, January
1996.
-
J. Rissanen and
G. G. Langdon, Jr.
Universal modeling and coding.
IEEE Transactions on Information Theory, 27:12-23, 1981.
-
L. K. Saul,
T. Jaakkola and
M. I. Jordan.
Mean field theory for sigmoid belief networks.
Journal of Artificial Intelligence Research, 4:61-76, 1996.
[Abstract and PostScript]
-
M. J. Schervish.
Theory of Statistics.
Springer-Verlag, New York, 1995.
-
C. E. Shannon.
A mathematical theory of communication.
Bell System Technical Journal, 27:379-423, July 1948.
-
C. S. Wallace and
D. M. Boulton.
An information measure for classification.
Computer Journal, 11(2):185-194, 1968.
-
C. S. Wallace and
P. R. Freeman.
Estimation and inference by compact coding.
Journal of the Royal Statistical Society (Series B),
49(3):240-265, 1987.
-
R. S. Zemel.
A minimum description length framework for unsupervised
learning.
PhD thesis, University of Toronto, Canada, 1993.
-
R. S. Zemel and
G. E. Hinton.
Developing population codes by minimizing description length.
In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors,
NIPS 6, pages 11-18, San Francisco, 1994. Morgan Kaufmann.
Harri Lappalainen
<Harri.Lappalainen@hut.fi>