dense.py version 1.0                      -*- coding: iso-8859-1 -*-

The dense.py program finds dense itemsets, as described in the
publications

    Jouni K. Seppänen and Heikki Mannila. Dense Itemsets. In Ronny
    Kohavi, Johannes Gehrke, William DuMouchel, and Joydeep Ghosh,
    eds., Tenth ACM SIGKDD International Conference on Knowledge
    Discovery and Data Mining (KDD-2004), pp. 683-688, Seattle, WA,
    USA, 2004.

and

    Jouni K. Seppänen. Using and extending itemsets in data mining:
    query approximation, dense itemsets, and tiles. Doctoral
    dissertation, Department of Computer Science and Engineering,
    Helsinki University of Technology, 2006.

The first publication is available (sadly, not open-access) at
http://portal.acm.org/citation.cfm?id=1014140 and the second at
http://lib.tkk.fi/Diss/2006/isbn951228202X/

The program is Copyright (C) 2005,2006 Jouni K. Seppänen, and is
distributed under the Boost Software License, Version 1.0. See the
accompanying file LICENSE. If you use the software when preparing a
scientific publication, you should cite the publications listed
above; this is not a requirement of the license but a matter of
ethical scientific conduct.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
LICENSE file for more details.

USING
=====

Name your attributes/items as consecutive integers starting from zero,
and format your data as newline-separated lines of whitespace-
separated numbers. Then type

  ./dense.py -n 100 -s 10 filename 

to obtain the 100 maximal-density itemsets with support 10 (given as
an absolute number of lines, not as a fraction of the total), or

  ./dense.py -n 100 -d 0.8 filename 

to obtain the 100 maximal-support itemsets with density 0.8, or

  ./dense.py -s 10 -d 0.8 filename

to obtain the itemsets with density at least 0.8 at support at least
10. For the last task, the C++ implementation (dense.cc) is probably
better, but this Python implementation may be more useful for
experimentation.
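
If your data uses item names rather than numbers, the input format
described above can be produced with a short script. The following is
only a sketch, not part of dense.py: the function names
encode_transactions and write_dataset are hypothetical, and it assumes
that each input line lists the numbers of the items present in one row
of the data.

```python
def encode_transactions(transactions):
    """Map item labels to consecutive integers starting from zero.

    Returns the label-to-number mapping and one output line per
    transaction. Assumption: each line lists the items present in
    that transaction, separated by single spaces.
    """
    item_ids = {}
    lines = []
    for transaction in transactions:
        ids = set()
        for item in transaction:
            # Assign the next unused integer to a label seen for the first time.
            if item not in item_ids:
                item_ids[item] = len(item_ids)
            ids.add(item_ids[item])
        lines.append(" ".join(str(i) for i in sorted(ids)))
    return item_ids, lines


def write_dataset(path, lines):
    """Write the encoded transactions as newline-separated lines."""
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")


# Hypothetical example data: three transactions over three items.
item_ids, lines = encode_transactions([
    ["milk", "bread"],
    ["bread", "beer"],
    ["milk", "bread", "beer"],
])
```

Calling write_dataset("filename", lines) then writes a file that can
be passed to the commands above. Keep item_ids around so that the
numbers in the output can be translated back to item names.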
