Computational Modeling as a Methodology for Studying Human Language Learning

Poibeau, Thierry; Villavicencio, Aline; Korhonen, Anna; Alishahi, Afra

doi:10.1007/978-3-642-31863-4_1

Chapter
First Online: 01 January 2012

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

The nature and amount of information needed for learning a natural language, and the underlying mechanisms involved in this process, are the subject of much debate: How is the knowledge of language represented in the human brain? Is it possible to learn a language from usage data only, or is some sort of innate knowledge and/or bias needed to boost the process? Are different aspects of language learned in order? These are topics of interest to (psycho)linguists who study human language acquisition, as well as to computational linguists who develop the knowledge sources necessary for large-scale natural language processing systems. Children are the ultimate subjects of any study of language learnability. They learn language with ease and in a short period of time, and their acquired knowledge of language is flexible and robust.

Excerpts of this chapter have been published in Alishahi, A. (2010), Computational Modeling of Human Language Acquisition, Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers [2].

Notes

1. On the other hand, it has been suggested that the language learner can estimate the “typical” rate of generalization for each syntactic form, whose distribution serves as “indirect” negative evidence [15, 42].

References

Akhtar, N. (1999). Acquiring basic word order: Evidence for data-driven learning of syntactic structure. Journal of Child Language, 26, 339–356.
Alishahi, A. (2010). Computational modeling of human language acquisition (Synthesis lectures on human language technologies). San Rafael: Morgan & Claypool Publishers.
Bowerman, M. (1982). Evaluating competing linguistic models with language acquisition data: Implications of developmental errors with causative verbs. Quaderni di semantica, 3, 5–66.
Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61(1–2), 93–125.
Broen, P. A. (1972). The verbal environment of the language-learning child. Washington: American Speech and Hearing Association.
Burnard, L. (2000). Users reference guide for the British National Corpus (Technical Report). Oxford University Computing Services.
Buttery, P., & Korhonen, A. (2007). I will shoot your shopping down and you can shoot all my tins: Automatic lexical acquisition from the CHILDES database. In Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition (pp. 33–40). Prague: Association for Computational Linguistics.
Chater, N., & Manning, C. D. (2006). Probabilistic models of language processing and acquisition. Trends in Cognitive Sciences, 10(7), 335–344.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge: MIT Press.
Chomsky, N. (1975). The logical structure of linguistic theory. New York: Plenum Press.
Chomsky, N. (1980). Rules and representations. Oxford: Basil Blackwell.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht/Cinnaminson: Mouton de Gruyter.
Chomsky, N. (1986). Knowledge of language: Its nature, origin, and use. New York: Praeger Publishers.
Clark, E. V. (2009). First language acquisition (2nd ed.). Cambridge/New York: Cambridge University Press.
Clark, A., & Lappin, S. (2010). Linguistic nativism and the poverty of stimulus. Oxford/Malden, MA: Wiley Blackwell.
Culicover, P. W. (1999). Syntactic nuts. Oxford/New York: Oxford University Press.
De Marcken, C. G. (1996). Unsupervised language acquisition. Ph.D. thesis, MIT.
Dominey, P., & Boucher, J. (2005). Learning to talk about events from narrated video in a construction grammar framework. Artificial Intelligence, 167(1–2), 31–61.
Dowman, M. (2000). Addressing the learnability of verb subcategorizations with Bayesian inference. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society. Mahwah/London: Erlbaum.
Elman, J. (2001). Connectionism and language acquisition. In Essential readings in language acquisition. Oxford: Blackwell.
Fisher, C. (1996). Structural limits on verb mapping: The role of analogy in children’s interpretations of sentences. Cognitive Psychology, 31(1), 41–81.
Francis, W., Kučera, H., & Mackie, A. (1982). Frequency analysis of English usage: Lexicon and grammar. Boston: Houghton Mifflin Harcourt (HMH).
Frank, M., Goodman, N., & Tenenbaum, J. (2008). A Bayesian framework for cross-situational word learning. Advances in Neural Information Processing Systems, 20, 457–464.
Frazier, L., & Fodor, J. D. (1978). The sausage machine: A new two-stage parsing model. Cognition, 13, 187–222.
Gelman, S., & Taylor, M. (1984). How two-year-old children interpret proper and common names for unfamiliar objects. Child Development, 55, 1535–1540.
Gibson, E., & Wexler, K. (1994). Triggers. Linguistic Inquiry, 25, 407–454.
Godfrey, J., Holliman, E., & McDaniel, J. (1992). SWITCHBOARD: Telephone speech corpus for research and development. In 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-92) (Vol. 1). New York: IEEE.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 10(5), 447–474.
Goldberg, A. E. (1999). Emergence of the semantics of argument structure constructions. In The emergence of language (Carnegie Mellon Symposia on Cognition Series, pp. 197–212). Mahwah: Lawrence Erlbaum Associates.
Grünwald, P. (1996). A minimum description length approach to grammar inference. In S. Wermter, E. Riloff, & G. Scheler (Eds.), Connectionist, statistical and symbolic approaches to learning for natural language processing (Lecture Notes in Computer Science, Vol. 1040, pp. 203–216). Berlin/New York: Springer.
Hsu, A. S., & Chater, N. (2010). The logical problem of language acquisition: A probabilistic perspective. Cognitive Science, 34(6), 972–1016.
Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science, 20, 137–194.
Keller, B., & Lutz, R. (1997). Evolving stochastic context-free grammars from examples using a minimum description length principle. In Workshop on Automata Induction Grammatical Inference and Language Acquisition, ICML-97. San Francisco: Morgan Kaufmann Publishers.
Leech, G. (1992). 100 million words of English: The British National Corpus (BNC). Language Research, 28(1), 1–13.
Legate, J., & Yang, C. (2002). Empirical re-assessment of stimulus poverty arguments. Linguistic Review, 19(1/2), 151–162.
Leonard, L. (2000). Children with specific language impairment. Cambridge: MIT Press.
Li, M., & Vitányi, P. M. B. (1995). Computational machine learning in theory and praxis. In J. van Leeuwen (Ed.), Computer science today (Lecture notes in computer science, Vol. 1000). Heidelberg: Springer.
MacWhinney, B. (1982). Basic syntactic processes. In S. Kuczaj (Ed.), Language development: Syntax and semantics (Vol. 1, pp. 73–136). Hillsdale, NJ: Lawrence Erlbaum.
MacWhinney, B. (1987). The competition model. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Erlbaum.
MacWhinney, B. (1993). Connections and symbols: Closing the gap. Cognition, 49, 291–296.
MacWhinney, B. (1995). The CHILDES project: Tools for analyzing talk (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
MacWhinney, B. (2004). A multiple process solution to the logical problem of language acquisition. Journal of Child Language, 31, 883–914.
MacWhinney, B., Bird, S., Cieri, C., & Martell, C. (2004). TalkBank: Building an open unified multimodal database of communicative interaction. In Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon (pp. 525–528). Paris: ELRA.
Marcus, G. F. (1993). Negative evidence in language acquisition. Cognition, 46, 53–85.
Marcus, G. F., Pinker, S., Ullman, M., Hollander, M., Rosen, T. J., & Xu, F. (1992). Overregularization in language acquisition (Monographs of the Society for Research in Child Development, Vol. 57 (4, Serial No. 228)). Chicago: University of Chicago Press.
Marcus, M., Santorini, B., & Marcinkiewicz, M. (1994). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
Marr, D. (1982). Vision. San Francisco, CA: W. H. Freeman.
McClelland, J. L., Rumelhart, D. E., & The PDP Research Group (1986). Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 2). Cambridge, MA: Bradford Books/MIT Press.
Mintz, T. (2003). Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90(1), 91–117.
Parisse, C., & Le Normand, M. T. (2000). Automatic disambiguation of the morphosyntax in spoken language corpora. Behavior Research Methods, Instruments, and Computers, 32, 468–481.
Perfors, A., Tenenbaum, J., & Wonnacott, E. (2010). Variability, negative evidence, and the acquisition of verb argument constructions. Journal of Child Language, 37, 607–642.
Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.
Pinker, S. (1994). How could a child use verb syntax to learn verb semantics? Lingua, 92, 377–410.
Pullum, G., & Scholz, B. (2002). Empirical assessment of stimulus poverty arguments. Linguistic Review, 19(1/2), 9–50.
Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14(5), 465–471.
Roy, D. (2009). New horizons in the study of child language acquisition. In Proceedings of Interspeech 2009, Brighton. Grenoble: ISCA.
Rumelhart, D., & McClelland, J. (1987). Learning the past tenses of English verbs: Implicit rules or parallel distributed processing. In Mechanisms of language acquisition (pp. 195–248). Hillsdale: Erlbaum.
Sagae, K., Davis, E., Lavie, A., MacWhinney, B., & Wintner, S. (2007). High-accuracy annotation and parsing of CHILDES transcripts. In Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition (pp. 25–32). Prague: Association for Computational Linguistics.
Sagae, K., Davis, E., Lavie, A., MacWhinney, B., & Wintner, S. (2010). Morphosyntactic annotation of CHILDES transcripts. Journal of Child Language, 37(03), 705–729.
Steedman, M., Baldridge, J., Bozsahin, C., Clark, S., Curran, J., & Hockenmaier, J. (2005). Grammar acquisition by child and machine: The combinatory manifesto. Invited Talk at the Ninth Conference on Computational Natural Language Learning (CoNLL-2005), Ann Arbor.
Tanenhaus, M., Spivey-Knowlton, M., Eberhard, K., & Sedivy, J. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268(5217), 1632.
Tenenbaum, J., Griffiths, T., & Kemp, C. (2006). Theory-based Bayesian models of inductive learning and reasoning. Trends in Cognitive Sciences, 10(7), 309–318.
Tomasello, M. (2000). Do young children have adult syntactic competence? Cognition, 74, 209–253.
Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge: Harvard University Press.
Tomasello, M., Akhtar, N., Dodson, K., & Rekau, L. (1997). Differential productivity in young children’s use of nouns and verbs. Journal of Child Language, 24(02), 373–387.
Villavicencio, A. (2002). The acquisition of a unification-based generalised categorial grammar. Ph.D. thesis, Computer Laboratory, University of Cambridge.
Yang, C. (2002). Knowledge and learning in natural language. Oxford/New York: Oxford University Press.
Yu, C., & Ballard, D. (2007). A unified model of early word learning: Integrating statistical and social cues. Neurocomputing, 70(13–15), 2149–2165.
Yu, C., & Smith, L. (2006). Statistical cross-situational learning to build word-to-world mappings. In Proceedings of the 28th Annual Meeting of the Cognitive Science Society, Vancouver. Citeseer.

Author information

Authors and Affiliations

Laboratoire Langues, Textes, Traitements informatiques, Cognition, CNRS, Ecole Normale Supérieure and Université Sorbonne Nouvelle, Paris, France
Thierry Poibeau
Institute of Informatics, Federal University of Rio Grande do Sul, Av. Bento Gonçalves, 9500, Porto Alegre, Brazil
Aline Villavicencio
Computer Laboratory, University of Cambridge, Cambridge, CB3 0FD, UK
Anna Korhonen
Department of Theoretical and Applied Linguistics (DTAL), Cambridge, CB3 9DB, UK
Anna Korhonen
Department of Communication and Information Studies, Tilburg University, Tilburg, The Netherlands
Afra Alishahi

Corresponding author

Correspondence to Thierry Poibeau.

Editor information

Editors and Affiliations

Institute of Informatics, Federal University of Rio Grande do Sul, Av. Bento Gonçalves, 9500, Porto Alegre, Brazil
Aline Villavicencio
Université Sorbonne Nouvelle, LATTICE-CNRS and Ecole Normale Supérieure, 45 rue d'Ulm, Paris, 75005, France
Thierry Poibeau
Computer Laboratory, William Gates Building, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, United Kingdom
Anna Korhonen
Tilburg center for Cognition and Communication (TiCC), Tilburg University, Warandelaan 2, Tilburg, 5037, Netherlands
Afra Alishahi

About this chapter

Cite this chapter

Poibeau, T., Villavicencio, A., Korhonen, A., Alishahi, A. (2013). Computational Modeling as a Methodology for Studying Human Language Learning. In: Villavicencio, A., Poibeau, T., Korhonen, A., Alishahi, A. (eds) Cognitive Aspects of Computational Language Acquisition. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31863-4_1

DOI: https://doi.org/10.1007/978-3-642-31863-4_1
Published: 27 September 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31862-7
Online ISBN: 978-3-642-31863-4
eBook Packages: Computer Science, Computer Science (R0)
