Sami Virpioja
D.Sc. (Tech.), Researcher
- Office:
Room T-A316 in Computer Science Building,
Konemiehentie 2, Otaniemi campus area, Espoo
- Postal Address:
Aalto University School of Science,
Department of Information and Computer Science,
P.O. Box 15400, FI-00076 Aalto, Finland
- Telephone:
+358 50 4301966
- Email:
firstname.lastname@aalto.fi
About me
I am interested in how machine learning methods can be used to
model complex phenomena such as language. Because language data is
sparse, it is important to find structures that represent the data
more efficiently. For an example, see the page
of the
Morpho project and
the demonstration of the Morfessor algorithm. My research topics
also include practical applications of statistical language modeling,
especially speech recognition and machine translation.
I participate in the following research groups at Aalto University:
See the complete list of publications or
the selected papers below. Click on a publication title to check
its availability and BibTeX entry.
Doctoral thesis
-
Sami Virpioja (2012).
- Learning Constructions of Natural Language: Statistical Models and Evaluations. Aalto University, Doctoral dissertations 158/2012.
Journal articles
-
Sami Virpioja,
Mari-Sanna Paukkeri,
Abhishek Tripathi,
Tiina Lindh-Knuutila, and
Krista Lagus (2012).
-
Evaluating Vector Space Models with Canonical Correlation Analysis.
Natural Language Engineering, Volume 18, Issue 3, 2012, pp. 399-436.
-
Sami Virpioja,
Ville T. Turunen,
Sebastian Spiegler,
Oskar Kohonen, and
Mikko Kurimo (2011).
-
Empirical Comparison of Evaluation Methods for Unsupervised Learning of
Morphology.
Traitement Automatique des Langues, Volume 52, Issue 2, 2011,
pp. 45-90.
-
Vesa Siivola,
Teemu Hirsimäki
and Sami Virpioja (2007).
-
On Growing and Pruning Kneser-Ney Smoothed N-Gram Models.
IEEE Transactions on Audio, Speech and Language Processing, Volume 15,
Issue 5, July 2007, pp. 1617-1624.
-
Teemu Hirsimäki,
Mathias Creutz,
Vesa Siivola,
Mikko Kurimo, Sami Virpioja and
Janne Pylkkönen (2006).
-
Unlimited Vocabulary Speech Recognition with Morph
Language Models Applied to Finnish.
Computer Speech and Language, Volume 20, Issue 4, October 2006,
pp. 515-541.
Recent conference and workshop papers
-
Sami Virpioja,
Minna Lehtonen,
Annika Hultén,
Riitta Salmelin,
and Krista Lagus (2011).
-
Predicting Reaction Times in Word Recognition by Unsupervised Learning of
Morphology.
In Artificial Neural Networks and Machine Learning – ICANN 2011,
volume 6791 of Lecture Notes in Computer Science,
pages 275-282. Springer Berlin / Heidelberg, June 2011.
-
Oskar Kohonen, Sami Virpioja, and Krista Lagus (2010).
-
Semi-supervised learning of concatenative morphology.
In Proceedings of the 11th Meeting of the ACL Special Interest Group on
Computational Morphology and Phonology, pages 78-86, Uppsala, Sweden,
July 2010. Association for Computational Linguistics.
-
Mikko Kurimo, Sami Virpioja, Ville Turunen, and Krista Lagus (2010).
-
Morpho Challenge 2005-2010: Evaluations and results.
In Proceedings of the 11th Meeting of the ACL Special Interest Group on
Computational Morphology and Phonology, pages 87-95, Uppsala, Sweden,
July 2010. Association for Computational Linguistics.
-
Sami Virpioja, Oskar Kohonen, and Krista Lagus (2010).
-
Unsupervised morpheme analysis with Allomorfessor.
In Multilingual Information Access Evaluation I. Text Retrieval Experiments: 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece, September 30 - October 2, 2009, Revised Selected Papers,
volume 6241 of Lecture Notes in Computer Science, pages 578-597.
Springer.
-
Adrià de Gispert,
Sami Virpioja,
Mikko Kurimo and
William Byrne (2009).
-
Minimum Bayes Risk Combination of Translation Hypotheses from Alternative
Morphological Decompositions. In Proceedings of Human Language
Technologies: The 2009 Annual Conference of the North American Chapter
of the Association for Computational Linguistics, Companion Volume:
Short Papers, pages 73-76, Boulder, CO, USA, June 2009.
Association for Computational Linguistics.