Publications by Sami Virpioja

2016

48. Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Hybrid morphological segmentation for phrase-based machine translation. In Proceedings of the First Conference on Machine Translation, pages 289–295, Berlin, Germany, August 2016. Association for Computational Linguistics.
47. Teemu Ruokolainen, Oskar Kohonen, Kairit Sirts, Stig-Arne Grönroos, Mikko Kurimo, and Sami Virpioja. A comparative study of minimally supervised morphological segmentation. Computational Linguistics, 42(1):91–120, 2016.
46. Matti Varjokallio, Sami Virpioja, and Mikko Kurimo. Class n-gram models for very large vocabulary speech recognition of Finnish and Estonian. In Pavel Král and Carlos Martín-Vide, editors, Statistical Language and Speech Processing: 4th International Conference, SLSP 2016, Pilsen, Czech Republic, October 11-12, 2016, Proceedings, volume 9918 of Lecture Notes in Computer Science, pages 133–144. Springer International Publishing, 2016.

2015

45. Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Tuning phrase-based segmented translation for a morphologically complex target language. In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 105–111, Lisbon, Portugal, September 2015. Association for Computational Linguistics.
44. Sami Virpioja and Stig-Arne Grönroos. LeBLEU: N-gram-based translation evaluation score for morphologically complex languages. In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 411–416, Lisbon, Portugal, September 2015. Association for Computational Linguistics.
43. Stig-Arne Grönroos, Kristiina Jokinen, Katri Hiovain, Mikko Kurimo, and Sami Virpioja. Low-resource active learning of North Sámi morphological segmentation. In Tommi A. Pirinen, Francis M. Tyers, and Trond Trosterud, editors, Proceedings of the 1st International Workshop on Computational Linguistics for Uralic Languages (IWCLUL 2015), pages 20–33, Tromsø, Norway, 2015. The University Library of Tromsø.

2014

42. Stig-Arne Grönroos, Sami Virpioja, Peter Smit, and Mikko Kurimo. Morfessor FlatCat: An HMM-based method for unsupervised and semi-supervised learning of morphology. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 1177–1185, Dublin, Ireland, August 2014. Dublin City University and Association for Computational Linguistics.
41. Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, and Mikko Kurimo. Painless semi-supervised morphological segmentation using conditional random fields. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, pages 84–89, Gothenburg, Sweden, April 2014. Association for Computational Linguistics.
40. Peter Smit, Sami Virpioja, Stig-Arne Grönroos, and Mikko Kurimo. Morfessor 2.0: Toolkit for statistical morphological segmentation. In Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics, pages 21–24, Gothenburg, Sweden, April 2014. Association for Computational Linguistics.

2013

39. Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, and Mikko Kurimo. Supervised morphological segmentation in a low-resource learning setting using conditional random fields. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 29–37, Sofia, Bulgaria, August 2013. Association for Computational Linguistics.
38. Matti Varjokallio, Mikko Kurimo, and Sami Virpioja. Learning a subword vocabulary based on unigram likelihood. In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2013), 2013.
37. Sami Virpioja, Peter Smit, Stig-Arne Grönroos, and Mikko Kurimo. Morfessor 2.0: Python implementation and extensions for Morfessor Baseline. Report 25/2013 in Aalto University publication series SCIENCE + TECHNOLOGY, Department of Signal Processing and Acoustics, Aalto University, Helsinki, Finland, 2013.

2012

36. Sami Virpioja. Learning Constructions of Natural Language: Statistical Models and Evaluations. PhD thesis, Aalto University, December 2012.
35. Sami Virpioja, Mari-Sanna Paukkeri, Abhishek Tripathi, Tiina Lindh-Knuutila, and Krista Lagus. Evaluating vector space models with canonical correlation analysis. Natural Language Engineering, 18(3):399–436, July 2012.
34. Sami Virpioja. Evaluation methods for unsupervised natural language learning. In Sasu Tarkoma, Joni-Kristian Kämäräinen, and Tapio Pahikkala, editors, Federated Computer Science Event 2012, number B-2012-1 in Department of Computer Science Series of Publications B, pages 66–67, Helsinki, Finland, 2012. Department of Computer Science, University of Helsinki.

2011

33. Sami Virpioja, Minna Lehtonen, Annika Hultén, Riitta Salmelin, and Krista Lagus. Predicting reaction times in word recognition by unsupervised learning of morphology. In Timo Honkela, Wlodzislaw Duch, Mark Girolami, and Samuel Kaski, editors, Artificial Neural Networks and Machine Learning – ICANN 2011, volume 6791 of Lecture Notes in Computer Science, pages 275–282. Springer Berlin / Heidelberg, June 2011.
32. Sami Virpioja, Oskar Kohonen, and Krista Lagus. Evaluating the effect of word frequencies in a probabilistic generative model of morphology. In Bolette Sandford Pedersen, Gunta Nešpore, and Inguna Skadina, editors, Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011), volume 11 of NEALT Proceedings Series, pages 230–237. Northern European Association for Language Technology, Riga, Latvia, May 2011.
31. Sami Virpioja, Ville T. Turunen, Sebastian Spiegler, Oskar Kohonen, and Mikko Kurimo. Empirical comparison of evaluation methods for unsupervised learning of morphology. Traitement Automatique des Langues, 52(2):45–90, 2011.

2010

30. Sami Virpioja, Oskar Kohonen, and Krista Lagus. Unsupervised morpheme analysis with Allomorfessor. In Multilingual Information Access Evaluation I. Text Retrieval Experiments: 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece, September 30 – October 2, 2009, Revised Selected Papers, volume 6241 of Lecture Notes in Computer Science, pages 609–616. Springer Berlin / Heidelberg, September 2010.
29. Mikko Kurimo, Sami Virpioja, Ville T. Turunen, Graeme W. Blackwood, and William Byrne. Overview and results of Morpho Challenge 2009. In Multilingual Information Access Evaluation I. Text Retrieval Experiments: 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece, September 30 – October 2, 2009, Revised Selected Papers, volume 6241 of Lecture Notes in Computer Science, pages 578–597. Springer Berlin / Heidelberg, September 2010.
28. Mikko Kurimo, Sami Virpioja, and Ville T. Turunen (Eds.). Proceedings of the Morpho Challenge 2010 workshop. Technical Report TKK-ICS-R37, Aalto University School of Science and Technology, Department of Information and Computer Science, Espoo, Finland, September 2010.
27. Mikko Kurimo, Sami Virpioja, and Ville T. Turunen. Overview and results of Morpho Challenge 2010. In Proceedings of the Morpho Challenge 2010 Workshop, pages 7–24, Espoo, Finland, September 2010. Aalto University School of Science and Technology, Department of Information and Computer Science. Technical Report TKK-ICS-R37.
26. Oskar Kohonen, Sami Virpioja, Laura Leppänen, and Krista Lagus. Semi-supervised extensions to Morfessor Baseline. In Mikko Kurimo, Sami Virpioja, and Ville T. Turunen, editors, Proceedings of the Morpho Challenge 2010 Workshop, pages 30–34, Espoo, Finland, September 2010. Aalto University School of Science and Technology, Department of Information and Computer Science. Technical Report TKK-ICS-R37. Extended abstract.
25. Abhishek Tripathi, Arto Klami, and Sami Virpioja. Bilingual sentence matching using kernel CCA. In Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010), pages 130–135, Kittilä, Finland, August 2010. IEEE.
24. Mikko Kurimo, Sami Virpioja, Ville Turunen, and Krista Lagus. Morpho Challenge 2005-2010: Evaluations and results. In Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology, pages 87–95, Uppsala, Sweden, July 2010. Association for Computational Linguistics.
23. Oskar Kohonen, Sami Virpioja, and Krista Lagus. Semi-supervised learning of concatenative morphology. In Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology, pages 78–86, Uppsala, Sweden, July 2010. Association for Computational Linguistics.
22. Sami Virpioja, André Mansikkaniemi, Jaakko Väyrynen, and Mikko Kurimo. Applying morphological decompositions to statistical machine translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 201–206. Association for Computational Linguistics, July 2010.
21. Tommi Vatanen, Jaakko J. Väyrynen, and Sami Virpioja. Language identification of short text segments with n-gram models. In Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, and Daniel Tapias, editors, Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10), Valletta, Malta, May 2010. European Language Resources Association (ELRA).

2009

20. Oskar Kohonen, Sami Virpioja, and Krista Lagus. A constructionist approach to grammar inference. In NIPS Workshop on Grammar Induction, Representation of Language and Language Learning, Whistler, Canada, December 2009. Extended abstract.
19. Oskar Kohonen, Sami Virpioja, and Mikaela Klami. Allomorfessor: Towards unsupervised morpheme analysis. In Evaluating Systems for Multilingual and Multimodal Information Access: 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17–19, 2008, Revised Selected Papers, volume 5706 of Lecture Notes in Computer Science, pages 975–982. Springer Berlin / Heidelberg, September 2009.
18. Sami Virpioja and Oskar Kohonen. Unsupervised morpheme analysis with Allomorfessor. In Working Notes for the CLEF 2009 Workshop, Corfu, Greece, September 2009.
17. Mikko Kurimo, Sami Virpioja, Ville T. Turunen, Graeme W. Blackwood, and William Byrne. Overview and results of Morpho Challenge 2009. In Working Notes for the CLEF 2009 Workshop, Corfu, Greece, September 2009.
16. Krista Lagus, Mathias Creutz, Sami Virpioja, and Oskar Kohonen. Morpheme segmentation by optimizing two-part MDL codes. In 2009 Workshop on Information Theoretic Methods in Science and Engineering (WITMSE), Tampere, Finland, August 2009. Extended abstract.
15. Adrià de Gispert, Sami Virpioja, Mikko Kurimo, and William Byrne. Minimum Bayes risk combination of translation hypotheses from alternative morphological decompositions. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers, pages 73–76, Boulder, USA, June 2009. Association for Computational Linguistics.
14. Mikko Kurimo, Teemu Hirsimäki, Ville Turunen, Sami Virpioja, and Niklas Raatikainen. Unsupervised decomposition of words for speech recognition and retrieval. In Proceedings of the 13th International Conference Speech and Computer, SPECOM 2009, pages 23–28, St. Petersburg, Russia, June 21–25, 2009.
13. Mikko Kurimo, Sami Virpioja, Ville Turunen, and Teemu Hirsimäki. Morpho Challenge - evaluation of algorithms for unsupervised learning of morphology in various tasks and languages. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Demonstration Session, pages 13–16, Boulder, Colorado, June 2009. Association for Computational Linguistics. Abstract of a demonstration.
12. Krista Lagus, Oskar Kohonen, and Sami Virpioja. Towards unsupervised learning of constructions from text. In Proceedings of the Workshop on Extracting and Using Constructions in NLP of the 17th Nordic Conference on Computational Linguistics (NODALIDA), Odense, Denmark, May 2009. SICS Technical Report T2009:10.
11. Mathias Creutz, Sami Virpioja, and Anna Kovaleva. Web augmentation of language models for continuous speech recognition of SMS text messages. In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pages 157–165, Athens, Greece, March 2009. Association for Computational Linguistics.

2008

10. Timo Honkela, Sami Virpioja, and Jaakko J. Väyrynen. Adaptive translation: Finding interlingual mappings using self-organizing maps. In Proceedings of the 18th International Conference on Artificial Neural Networks (ICANN 2008), pages 603–612, Prague, Czech Republic, September 2008.
9. Oskar Kohonen, Sami Virpioja, and Mikaela Klami. Allomorfessor: Towards unsupervised morpheme analysis. In Working Notes for the CLEF 2008 Workshop, Aarhus, Denmark, September 2008.

2007

8. Sami Virpioja, Jaakko J. Väyrynen, Mathias Creutz, and Markus Sadeniemi. Morphology-aware statistical machine translation based on morphs induced in an unsupervised manner. In Proceedings of the Machine Translation Summit XI, pages 491–498, Copenhagen, Denmark, September 2007.
7. Vesa Siivola, Teemu Hirsimäki, and Sami Virpioja. On growing and pruning Kneser-Ney smoothed n-gram models. IEEE Transactions on Audio, Speech and Language Processing, 15(5):1617–1624, July 2007.

2006

6. Teemu Hirsimäki, Mathias Creutz, Vesa Siivola, Mikko Kurimo, Sami Virpioja, and Janne Pylkkönen. Unlimited vocabulary speech recognition with morph language models applied to Finnish. Computer Speech and Language, 20(4):515–541, October 2006.
5. Sami Virpioja and Mikko Kurimo. Compact n-gram models by incremental growing and clustering of histories. In Proceedings of 9th International Conference on Spoken Language Processing (Interspeech 2006 – ICSLP), pages 1037–1040, Pittsburgh, PA, USA, September 2006.
4. Mathias Creutz, Krista Lagus, and Sami Virpioja. Unsupervised morphology induction using Morfessor. In A. Yli-Jyrä, L. Karttunen, and J. Karhumäki, editors, Finite-State Methods and Natural Language Processing (FSMNLP 2005), volume 4002 of Lecture Notes in Computer Science, pages 300–301. Springer-Verlag Berlin Heidelberg, 2006. Abstract of a software demo.

2005

3. Sami Virpioja. New methods for statistical natural language modeling. Master's thesis, Helsinki University of Technology, Department of Computer Science and Engineering, Laboratory of Computer and Information Science, December 2005.
2. Krista Lagus, Mathias Creutz, and Sami Virpioja. Latent linguistic codes for morphemes using independent component analysis. In A. Cangelosi, G. Bugmann, and R. Borisyuk, editors, Modeling language, cognition and action: Proceedings of the Ninth Neural Computation and Psychology Workshop (NCPW9), Plymouth, England, September 2005.
1. Mathias Creutz, Krista Lagus, Krister Lindén, and Sami Virpioja. Morfessor and Hutmegs: Unsupervised morpheme segmentation for highly-inflecting and compounding languages. In Proceedings of the Second Baltic Conference on Human Language Technologies, pages 107–112, Tallinn, Estonia, April 2005.