
Third Course ``Machine Learning: Advanced Probabilistic Methods''

This course is the most advanced course in the Macadamia program and builds on the previous courses covered in earlier sections. This gives us the opportunity to rely on the background knowledge taught in the prerequisite courses and to concentrate on probabilistic methods without recapitulating the underlying basics. The textbook used in the course is Bishop's recent book [Bishop, 2006] on pattern recognition and machine learning. During the course, a subset of five chapters is covered; the textbook is complemented with additional material when needed.

The main topic of the course is probabilistic inference and learning in the context of machine learning, with special emphasis on the framework of graphical models. The course starts with a presentation of Bayesian networks and Bayes' theorem as the means to "answer questions", that is, to perform inference using the model and the available evidence. After the rather general presentation of Bayesian networks and the algorithmic possibilities for exact inference, the course covers mixture models and the EM algorithm. The order of the material has been designed in the hope that the students realize that mixture models are "just simple Bayesian networks", so the earlier material applies here as well. Naturally, we actively reinforce this view in the lectures and the exercises. The EM algorithm is introduced in the context of mixture models, which makes the presentation easier than in the most general setting; the algorithm is first presented as is, and the principles of its derivation are covered in later lectures. The course continues with models for sequential data, again with the emphasis on the models being "just slightly more complex Bayesian networks". Models building on Markov chains, such as hidden Markov models, and possibilities for their extension are presented. Up to this point, exact inference and the EM algorithm have been the main tools. Towards the end of the course, approximate inference with sampling and variational algorithms is presented. This is backed up by the earlier material in the Bayesian network framework, where the possible reasons for the difficulty or even infeasibility of exact inference were reviewed.
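
To make this central principle concrete, the inference task can be stated in a generic form (a summary formulation, not a particular lecture example). In a Bayesian network over variables x_1, ..., x_n, the joint distribution factorizes according to the graph,

    P(x_1, ..., x_n) = \prod_{i=1}^{n} P(x_i | pa(x_i)),

where pa(x_i) denotes the parents of x_i in the graph. "Answering a question" about a query variable X given evidence e is then an application of Bayes' theorem, with the remaining hidden variables h summed out:

    P(X | e) = P(X, e) / P(e) = \sum_h P(X, h, e) / \sum_{X, h} P(X, h, e).

The sums over the hidden variables are exactly what makes exact inference expensive, or even infeasible, in densely connected networks.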

The course contents and organization were designed by Jaakko Hollmén, who also lectured the course. The practical pen-and-paper exercises were designed together with Tapani Raiko, who led the exercise sessions. The ultimate goal behind the design of the course is to teach the fundamental principles behind probabilistic inference, that is, Bayes' theorem and the computational principles of its execution in different models, and to apply these principles in model construction and learning from data. Whereas the principles are of great importance, it is even more important to tie the world of principles together with pragmatic implementations and real-world data sets. Towards this aim, practical implementations in a concrete machine learning scenario are presented during the lectures and exercises. Moreover, a term project on the topic of mixture modeling is given. In contrast to our earlier practice of giving programming exercises to the students, a full software package, BernoulliMix, has been used. In the term project, the students concentrate on the modeling aspects and on thinking about the results instead of "just getting the programs to work". The design of the exercise and the contents of the BernoulliMix program package are reviewed in a separate article in the current volume, see [Hollmén & Raiko, 2008]. The BernoulliMix home page is at http://www.cis.hut.fi/jhollmen/BernoulliMix, where the manual, including the exercises, may be found.
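
To give an impression of the model class used in the term project, the following is a minimal sketch of the EM algorithm for a mixture of multivariate Bernoulli distributions, written in Python with NumPy. It illustrates the principle only and is not the BernoulliMix implementation; the function name, initialization, and defaults are our own illustrative choices.

    import numpy as np

    def em_bernoulli_mixture(X, K, n_iter=50, seed=None):
        """Illustrative EM for a mixture of multivariate Bernoullis.

        A minimal sketch, not the BernoulliMix implementation.
        X: binary data matrix of shape (N, D); K: number of components.
        """
        rng = np.random.default_rng(seed)
        N, D = X.shape
        pi = np.full(K, 1.0 / K)                 # mixing proportions
        theta = rng.uniform(0.25, 0.75, (K, D))  # Bernoulli parameters

        for _ in range(n_iter):
            # E-step: responsibilities r[n, k] = P(z_n = k | x_n)
            log_p = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
            log_r = np.log(pi) + log_p
            log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)

            # M-step: update parameters from expected sufficient statistics
            Nk = r.sum(axis=0)
            pi = Nk / N
            theta = (r.T @ X) / Nk[:, None]
            theta = np.clip(theta, 1e-6, 1 - 1e-6)     # avoid log(0)

        return pi, theta

For binary data X of shape (N, D), a call such as pi, theta = em_bernoulli_mixture(X, K=3) returns the estimated mixing proportions and the component-wise Bernoulli parameters.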

Preliminary experiences with the course, which ran for the first time during spring 2008, have been positive. During the last lecture, the exam requirements were presented, and much effort was put into relating all the taught material to each other and emphasizing important topics. This was especially appreciated by the students in their free-form feedback after the last lecture; they hoped for even more of this kind of relating in the earlier parts of the course. Paradoxically, in the beginning phases of the course, the teacher can only relate the material learned so far (although some careful forward references to coming material and topics can be made). The feedback can be seen as positive evidence for the chosen framework of graphical models, or more specifically Bayesian networks, and for placing all the subsequent material within that framework.

The course material, including lecture notes and the problems and solutions of the exercises, may be found on the course home page at http://www.cis.hut.fi/Opinnot/T-61.5140/.

