The Lunch is Never Free:
How Information Theory, MDL, and Statistics are Connected


Model techniques are becoming increasingly popular in many diverse data mining subfields such as sequence mining, graph mining, and pattern mining. One particularly popular approach, due to its interpretability and practicality, is the Minimum Description Length (MDL) principle, which is based on an information-theoretic approach. In this tutorial we present the basic concepts of MDL, Information Theory, and Bayesian Statistics, with an emphasis on how they are connected and what the consequences of these connections are. These connections provide additional insights into the MDL principle and information theory, provide a stronger theoretical background, and allow us to use tools from statistics, but they also point out limitations that are not immediately apparent.

The tutorial will be given at ECML PKDD 2014, Nancy, France.


Download slides from here.


Will appear later