First Course ``Machine Learning: Basic Principles''



One of our objectives in constructing the Macadamia curriculum was to update the previous curriculum to better reflect contemporary topics. To this end, two courses focusing on neural networks were condensed into one, and one advanced course on statistical learning methods was replaced by two courses: the introductory course ``Machine Learning: Basic Principles'' and the advanced course ``Machine Learning: Advanced Probabilistic Methods''.

After the introductory course, which was designed and lectured by Kai Puolamäki, the student should be able to apply the basic methods to real-world data, understand the basic principles behind the methods, and have the prerequisites needed to understand and apply new concepts and methods that build on the topics covered in the course. As prior knowledge we require the basic mathematics and probability courses, the basics of algorithms, and the basic programming courses.

The course should also be interesting enough to attract gifted individuals to the more advanced courses, while remaining useful for those for whom it is the only course on machine learning.

We also wanted a textbook so that the course material would not be too scattered; after long consideration, we chose Alpaydin (2004). The purpose of the course was to emphasize the principles of machine learning and probabilistic reasoning rather than to introduce all possible methods.

The course was designed to be modular, consisting of lectures, problem sessions held during the lecturing periods, and a term project. At each stage the students were required to apply what they had learned to real-world data sets.

The term project consisted of a non-trivial classification task in which the students had to classify web sites as spam or not, using the WEBSPAM-UK2006 collection. The term project was organized as a challenge, with nominal prizes awarded to the best-performing teams, who also presented their work at a mini-workshop held during one of the problem sessions. We advised the students to favour simple and understandable methods and not to attempt the fancier approaches found in the literature.

We collected extensive course feedback. The students considered the term project challenging and appreciated it, especially because it dealt with a ``real problem'' rather than toy data. The term project made it possible to put the principles and methods learned in the course into practice. The challenge format and the mini-workshop were probably also successful, as the students saw various approaches to the same problem they had been trying to tackle.

One source of criticism was that the course had a lot of content. In the future we should probably reconsider whether to drop some topics and study others in more detail.

Another source of criticism was that many of the students had little experience in using machine learning software; for some, even importing a data set into analysis software was a challenge. The course did not teach the use of data analysis software and was designed to be independent of language and software, although the example code was given in GNU R.

All course materials, including the LaTeX source code of the slides, are available from the course web site at http://www.cis.hut.fi/Opinnot/T-61.3050/.


Tapani Raiko 2008-06-02