Abstract
A central question in the study of the mental lexicon is how morphologically complex words are processed. We consider this question from the viewpoint of statistical models of morphology. As an indicator of mental processing cost, we use reaction times to words in a visual lexical decision task on Finnish nouns. The statistical correlation between a model's predictions and the reaction times serves as the goodness measure of the model. In particular, we study Morfessor, an unsupervised method for learning concatenative morphology. The results for a set of inflected and monomorphemic Finnish nouns show that the probabilities given by Morfessor, especially its Categories-MAP version, correlate considerably more strongly with the reaction times than simple word statistics such as frequency, morphological family size, or length. These correlations are also higher than those obtained when any individual test subject is treated as a model.
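The evaluation idea in the abstract, scoring a model by how well its word probabilities correlate with reaction times, can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the abstract does not say which correlation coefficient or which transformation of the probabilities is used, so Pearson correlation on negative log-probabilities is an assumption here, and the numbers are toy placeholders, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def surprisal(probability):
    """Negative natural log-probability; an assumed proxy for processing cost."""
    return -math.log(probability)


# Toy placeholder values (hypothetical, NOT taken from the paper):
# one model probability and one mean reaction time (ms) per word.
toy_probs = [1e-3, 1e-4, 1e-5, 1e-6]
toy_rts_ms = [550.0, 600.0, 640.0, 700.0]

# Higher correlation means the model tracks the reaction times more closely.
r = pearson([surprisal(p) for p in toy_probs], toy_rts_ms)
print(f"correlation between surprisal and reaction time: {r:.3f}")
```

The same function can score any competing predictor, such as word length or log frequency, on the same word list, which is what allows different models to be compared on one scale.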
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Virpioja, S., Lehtonen, M., Hultén, A., Salmelin, R., Lagus, K. (2011). Predicting Reaction Times in Word Recognition by Unsupervised Learning of Morphology. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2011. ICANN 2011. Lecture Notes in Computer Science, vol 6791. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21735-7_34
DOI: https://doi.org/10.1007/978-3-642-21735-7_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21734-0
Online ISBN: 978-3-642-21735-7
eBook Packages: Computer Science (R0)