
References

  1. H. Attias.
    Hierarchical ICA belief networks.
    In M. S. Kearns, S. A. Solla, and D. A. Cohn, editors, NIPS 11, 1999. In press.
  2. H. Attias.
    Independent factor analysis.
    Neural Computation, 11(4):803-851, 1999.
    [PostScript (674 kb)]
  3. D. Barber and C. M. Bishop.
    Ensemble learning for multi-layer networks.
    In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, NIPS 10, pages 395-401, 1998. The MIT Press.
    [Abstract and PostScript]
  4. D. Barber and B. Schottky.
    Radial basis functions: a Bayesian treatment.
    In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, NIPS 10, pages 402-408, 1998. The MIT Press.
    [Abstract and PostScript]
  5. C. M. Bishop, N. Lawrence, T. Jaakkola, and M. I. Jordan.
    Approximating posterior distributions in belief networks using mixtures.
    In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, NIPS 10, pages 416-422, 1998. The MIT Press.
  6. Z. Ghahramani and G. E. Hinton.
    Hierarchical Nonlinear Factor Analysis and Topographic Maps.
    In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, NIPS 10, pages 486-492, 1998.
    [PostScript]
  7. G. E. Hinton and D. van Camp.
    Keeping neural networks simple by minimizing the description length of the weights.
    In Proceedings of the COLT'93, pages 5-13, Santa Cruz, California, 1993.
    [PostScript (744 kb)], index terms, and [PDF]
  8. G. E. Hinton and Z. Ghahramani.
    Generative Models for Discovering Sparse Distributed Representations.
    Philosophical Transactions of the Royal Society B, 352:1177-1190, 1997.
    [PostScript]
  9. G. E. Hinton and R. S. Zemel.
    Autoencoders, minimum description length and Helmholtz free energy.
    In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors, NIPS 6, pages 3-10, San Francisco, 1994. Morgan Kaufmann.
  10. S. Hochreiter and J. Schmidhuber.
    Flat minima.
    Neural Computation, 9(1):1-42, January 1997.
    [PDF]
  11. S. Hochreiter and J. Schmidhuber.
    LOCOCODE performs nonlinear ICA without knowing the number of sources.
    In Proceedings of the ICA'99, pages 149-154, Aussois, France, 1999.
  12. H. Lappalainen.
    Using an MDL-based cost function with neural networks.
    In Proceedings of the IJCNN'98, pages 2384-2389, Anchorage, Alaska, 1998.
    [HTML], [PostScript (63 kb)]
  13. H. Lappalainen.
    Ensemble learning for independent component analysis.
    In Proceedings of the ICA'99, pages 7-12, Aussois, France, 1999.
    [HTML], [PostScript (90 kb)]
  14. H. Lappalainen and X. Giannakopoulos.
    Multi-layer perceptrons as nonlinear generative models for unsupervised learning: a Bayesian treatment.
    In Proceedings of ICANN'99. Accepted for publication.
    [HTML], [PostScript (126 kb)]
  15. D. J. C. MacKay.
    Bayesian interpolation.
    Neural Computation, 4:415-447, 1992.
  16. D. J. C. MacKay.
    A practical Bayesian framework for backpropagation networks.
    Neural Computation, 4:448-472, 1992.
  17. D. J. C. MacKay.
    The evidence framework applied to classification networks.
    Neural Computation, 4:698-714, 1992.
  18. D. J. C. MacKay.
    Probable networks and plausible predictions - a review of practical Bayesian methods for supervised neural networks.
    Network, 6(3):469-505, 1995.
  19. D. J. C. MacKay.
    Ensemble learning and evidence maximization.
    [PostScript]
  20. D. J. C. MacKay.
    Developments in Probabilistic Modelling with Neural Networks - Ensemble Learning.
    In Neural Networks: Artificial Intelligence and Industrial Applications. Proceedings of the 3rd Annual Symposium on Neural Networks, Nijmegen, Netherlands, 14-15 September 1995, pages 191-198, Berlin, 1995. Springer.
    [PostScript (45 kb)]
  21. D. J. C. MacKay.
    Ensemble learning for hidden Markov Models.
    Available from http://wol.ra.phy.cam.ac.uk/, 1997.
    [PostScript (33 kb)]
  22. D. J. C. MacKay.
    Comparison of approximate methods for handling hyperparameters.
    Neural Computation. Submitted.
    [PostScript]
  23. É. Moulines, J.-F. Cardoso, and E. Gassiat.
    Maximum likelihood for blind separation and deconvolution of noisy signals using mixture models.
    In Proceedings of the ICASSP'97, pages 3617-3620, Munich, Germany, 1997.
    [PostScript]
  24. R. M. Neal.
    Learning Stochastic Feedforward Networks.
    Technical Report CRG-TR-90-7, Dept. of Computer Science, University of Toronto.
  25. J.-H. Oh and H. S. Seung.
    Learning generative models with the up-propagation algorithm.
    In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, NIPS 10, pages 605-611, 1998. The MIT Press.
    [PostScript]
  26. B. Pfahringer.
    Compression-based feature subset selection.
    In P. Turney, editor, IJCAI-95 Workshop on Data Engineering for Inductive Learning. IJCAI-95 Workshop Program Working Notes, Montreal, Canada, 1995.
  27. J. Rissanen.
    Modeling by shortest data description.
    Automatica, 14:465-471, 1978.
  28. J. Rissanen.
    A universal prior for integers and estimation by minimum description length.
    Annals of Statistics, 11(2):416-431, 1983.
  29. J. Rissanen.
    Stochastic complexity.
    Journal of the Royal Statistical Society (Series B), 49(3):223-239 and 252-265, 1987.
  30. J. Rissanen.
    Fisher information and stochastic complexity.
    IEEE Transactions on Information Theory, 42(1):40-47, January 1996.
  31. J. Rissanen and G. G. Langdon, Jr.
    Universal modeling and coding.
    IEEE Transactions on Information Theory, 27:12-23, 1981.
  32. L. K. Saul, T. Jaakkola, and M. I. Jordan.
    Mean field theory for sigmoid belief networks.
    Journal of Artificial Intelligence Research, 4:61-76, 1996.
    [Abstract and PostScript]
  33. M. J. Schervish.
    Theory of Statistics.
    Springer-Verlag, New York, 1995.
  34. C. E. Shannon.
    A mathematical theory of communication.
    Bell System Technical Journal, 27:379-423, July 1948.
  35. C. S. Wallace and D. M. Boulton.
    An information measure for classification.
    Computer Journal, 11(2):185-194, 1968.
  36. C. S. Wallace and P. R. Freeman.
    Estimation and inference by compact coding.
    Journal of the Royal Statistical Society (Series B), 49(3):240-265, 1987.
  37. R. S. Zemel.
    A minimum description length framework for unsupervised learning.
    PhD thesis, University of Toronto, Canada, 1993.
  38. R. S. Zemel and G. E. Hinton.
    Developing population codes by minimizing description length.
    In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors, NIPS 6, pages 11-18, San Francisco, 1994. Morgan Kaufmann.


Harri Lappalainen
<Harri.Lappalainen@hut.fi>