Sami Virpioja
D.Sc. (Tech.), Researcher
In October 2016, I moved to the
Department of Signal Processing and Acoustics.
- Office:
Room F407 in Health Technology House,
Rakentajanaukio 2C, Otaniemi campus area, Espoo
- Postal Address:
Aalto University School of Electrical Engineering,
Department of Signal Processing and Acoustics,
P.O. Box 12200, FI-00076 Aalto, Finland
- Email:
Research interests
I am interested in how machine learning methods can be used to
model complex phenomena such as language. Because language data is
sparse, it is important to find structures that represent the data
more efficiently. For an example, see the page of the Morpho project.
My research topics also include practical applications of statistical
language modeling, especially speech recognition and machine translation.
During my doctoral studies, I participated in the Computational
Cognitive Systems and Speech Recognition research groups.
Journal articles
Stig-Arne Grönroos, Katri Hiovain, Peter Smit, Ilona Rauhala, Kristiina Jokinen, Mikko Kurimo, and Sami Virpioja (2016).
Low-Resource Active Learning of Morphological Segmentation.
Northern European Journal of Language Technology, Volume 4, 2016, pp. 47-72.
Teemu Ruokolainen, Oskar Kohonen, Kairit Sirts, Stig-Arne Grönroos, Mikko Kurimo, and Sami Virpioja (2016).
A Comparative Study of Minimally Supervised Morphological Segmentation.
Computational Linguistics, Volume 42, Issue 1, March 2016, pp. 91-120.
Sami Virpioja, Mari-Sanna Paukkeri, Abhishek Tripathi, Tiina Lindh-Knuutila, and Krista Lagus (2012).
Evaluating Vector Space Models with Canonical Correlation Analysis.
Natural Language Engineering, Volume 18, Issue 3, 2012, pp. 399-436.
Sami Virpioja, Ville T. Turunen, Sebastian Spiegler, Oskar Kohonen, and Mikko Kurimo (2011).
Empirical Comparison of Evaluation Methods for Unsupervised Learning of
Traitement Automatique des Langues, Volume 52, Issue 2, 2011, pp. 45-90.
Vesa Siivola, Teemu Hirsimäki, and Sami Virpioja (2007).
On Growing and Pruning Kneser-Ney Smoothed N-Gram Models.
IEEE Transactions on Audio, Speech and Language Processing, Volume 15, Issue 5, July 2007, pp. 1617-1624.
Teemu Hirsimäki, Mathias Creutz, Vesa Siivola, Mikko Kurimo, Sami Virpioja, and Janne Pylkkönen (2006).
Unlimited Vocabulary Speech Recognition with Morph Language Models Applied to Finnish.
Computer Speech and Language, Volume 20, Issue 4, October 2006, pp. 515-541.
Selected conference and workshop papers
Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo (2016).
Hybrid morphological segmentation for phrase-based machine translation.
In Proceedings of the First Conference on Machine Translation,
pages 289-295, Berlin, Germany, August 2016.
Association for Computational Linguistics.
Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo (2015).
Tuning phrase-based segmented translation for a morphologically complex target language.
In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 105-111, Lisbon, Portugal, September 2015. Association for Computational Linguistics.
Stig-Arne Grönroos, Sami Virpioja, Peter Smit, and Mikko Kurimo (2014).
Morfessor FlatCat: An HMM-based method for unsupervised and semi-supervised learning of morphology.
In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 1177-1185, Dublin, Ireland, August 2014. Dublin City University and Association for Computational Linguistics.
Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, and Mikko Kurimo (2014).
Painless semi-supervised morphological segmentation using conditional random fields.
In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, pages 84-89, Gothenburg, Sweden, April 2014. Association for Computational Linguistics.
Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, and Mikko Kurimo (2013).
Supervised morphological segmentation in a low-resource learning setting using conditional random fields.
In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 29-37, Sofia, Bulgaria, August 2013. Association for Computational Linguistics.
Sami Virpioja, Minna Lehtonen, Annika Hultén, Riitta Salmelin, and Krista Lagus (2011).
Predicting Reaction Times in Word Recognition by Unsupervised Learning of
In Artificial Neural Networks and Machine Learning – ICANN 2011,
volume 6791 of Lecture Notes in Computer Science,
pages 275-282. Springer Berlin / Heidelberg, June 2011.
Oskar Kohonen, Sami Virpioja, and Krista Lagus (2010).
Semi-supervised learning of concatenative morphology.
In Proceedings of the 11th Meeting of the ACL Special Interest Group on
Computational Morphology and Phonology, pages 78-86, Uppsala, Sweden,
July 2010. Association for Computational Linguistics.
Mikko Kurimo, Sami Virpioja, Ville Turunen, and Krista Lagus (2010).
Morpho Challenge 2005-2010: Evaluations and results.
In Proceedings of the 11th Meeting of the ACL Special Interest Group on
Computational Morphology and Phonology, pages 87-95, Uppsala, Sweden,
July 2010. Association for Computational Linguistics.
Sami Virpioja, Oskar Kohonen, and Krista Lagus (2010).
Unsupervised morpheme analysis with Allomorfessor.
In Multilingual Information Access Evaluation I. Text Retrieval Experiments: 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece, September 30 - October 2, 2009, Revised Selected Papers,
volume 6241 of Lecture Notes in Computer Science, pages 578-597.
Doctoral thesis
Sami Virpioja (2012).
Learning Constructions of Natural Language: Statistical Models and Evaluations. Aalto University, Doctoral dissertations 158/2012.