Mari-Sanna Paukkeri and Timo Honkela. Likey: Unsupervised Language-Independent Keyphrase Extraction. In Proceedings of the 5th International Workshop on Semantic Evaluation (SemEval), pages 162–165, Uppsala, Sweden, July 2010. Association for Computational Linguistics.


Likey is an unsupervised statistical approach to keyphrase extraction. The method itself is language-independent: the only language-dependent component is the reference corpus against which the documents under analysis are compared. In this study, we also used a second language-dependent component, an English-specific Porter stemmer, as a pre-processing step. In our experiments on keyphrase extraction from scientific articles, the Likey method outperforms both supervised and unsupervised baseline methods.

Suggested BibTeX entry:

    address = {Uppsala, Sweden},
    author = {Mari-Sanna Paukkeri and Timo Honkela},
    booktitle = {Proceedings of the 5th International Workshop on Semantic Evaluation (SemEval)},
    month = {July},
    pages = {162--165},
    publisher = {Association for Computational Linguistics},
    title = {{Likey: Unsupervised Language-Independent Keyphrase Extraction}},
    year = {2010},

See ...