Computational Modeling as a Methodology for Studying Human Language Learning

Abstract

The nature and amount of information needed for learning a natural language, and the underlying mechanisms involved in this process, are the subject of much debate: how is the knowledge of language represented in the human brain? Is it possible to learn a language from usage data only, or is some sort of innate knowledge and/or bias needed to boost the process? Are different aspects of language learned in a particular order? These are topics of interest to (psycho)linguists who study human language acquisition, as well as to computational linguists who develop the knowledge sources necessary for large-scale natural language processing systems. Children are the ultimate subjects of any study of language learnability. They learn language with ease and in a short period of time, and their acquired knowledge of language is flexible and robust.

Excerpts of this chapter have been published in Alishahi, A. (2010), Computational Modeling of Human Language Acquisition, Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers [2].

Notes

  1. On the other hand, it has been suggested that the language learner can estimate the “typical” rate of generalization for each syntactic form, whose distribution serves as “indirect” negative evidence [15, 42] (see the sketch below).
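
As an illustration only (not from the chapter), the following Python sketch shows one way such indirect negative evidence could be quantified: estimate a corpus-wide rate for a construction, then measure how improbable a frequent verb's complete absence from that construction would be if the verb actually licensed it. The verbs, counts, and the choice of the double-object construction are invented for the example.

    # Minimal sketch of "indirect" negative evidence (illustrative assumptions only).
    from math import log

    def log_prob_absence(verb_count: int, baseline_rate: float) -> float:
        """Log-probability of never seeing the construction with this verb,
        assuming each verb occurrence uses the construction with baseline_rate."""
        return verb_count * log(1.0 - baseline_rate)

    # Hypothetical usage counts from child-directed speech.
    corpus = {
        "give":   {"occurrences": 500, "double_object_uses": 120},
        "donate": {"occurrences": 200, "double_object_uses": 0},
    }

    # Estimate the "typical" rate of the double-object construction across verbs.
    total = sum(v["occurrences"] for v in corpus.values())
    uses = sum(v["double_object_uses"] for v in corpus.values())
    baseline = uses / total

    for verb, stats in corpus.items():
        if stats["double_object_uses"] == 0:
            # The more often a verb occurs without the construction, the stronger
            # the indirect evidence that the construction is not licensed for it.
            surprise = -log_prob_absence(stats["occurrences"], baseline)
            print(f"{verb}: absence surprise = {surprise:.1f} nats")

Under these made-up counts, the absence of “donate” from the double-object frame accumulates roughly 38 nats of surprise; this is the kind of distributional signal that the accounts cited above appeal to.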

References

  1. Akhtar, N. (1999). Acquiring basic word order: Evidence for data-driven learning of syntactic structure. Journal of Child Language, 26, 339–356.

  2. Alishahi, A. (2010). Computational modeling of human language acquisition (Synthesis lectures on human language technologies). San Rafael: Morgan & Claypool Publishers.

  3. Bowerman, M. (1982). Evaluating competing linguistic models with language acquisition data: Implications of developmental errors with causative verbs. Quaderni di semantica, 3, 5–66.

  4. Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61(1–2), 93–125.

  5. Broen, P. A. (1972). The verbal environment of the language-learning child. Washington: American Speech and Hearing Association.

  6. Burnard, L. (2000). Users reference guide for the British National Corpus (Technical Report). Oxford University Computing Services.

  7. Buttery, P., & Korhonen, A. (2007). I will shoot your shopping down and you can shoot all my tins: Automatic lexical acquisition from the CHILDES database. In Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition (pp. 33–40). Prague: Association for Computational Linguistics.

  8. Chater, N., & Manning, C. D. (2006). Probabilistic models of language processing and acquisition. Trends in Cognitive Science, 10(7), 335–344.

  9. Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge: MIT Press.

  10. Chomsky, N. (1975). The logical structure of linguistic theory. New York: Plenum Press.

  11. Chomsky, N. (1980). Rules and representations. Oxford: Basil Blackwell.

  12. Chomsky, N. (1981). Lectures on government and binding. Dordrecht/Cinnaminson: Mouton de Gruyter.

  13. Chomsky, N. (1986). Knowledge of language: Its nature, origin, and use. New York: Praeger Publishers.

  14. Clark, E. V. (2009). First language acquisition (2nd ed.). Cambridge/New York: Cambridge University Press.

  15. Clark, A., & Lappin, S. (2010). Linguistic nativism and the poverty of the stimulus. Oxford/Malden, MA: Wiley Blackwell.

  16. Culicover, P. W. (1999). Syntactic nuts. Oxford/New York: Oxford University Press.

  17. De Marcken, C. G. (1996). Unsupervised language acquisition. Ph.D. thesis, MIT.

  18. Dominey, P., & Boucher, J. (2005). Learning to talk about events from narrated video in a construction grammar framework. Artificial Intelligence, 167(1–2), 31–61.

  19. Dowman, M. (2000). Addressing the learnability of verb subcategorizations with Bayesian inference. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society. Mahwah/London: Erlbaum.

  20. Elman, J. (2001). Connectionism and language acquisition. In Essential readings in language acquisition. Oxford: Blackwell.

  21. Fisher, C. (1996). Structural limits on verb mapping: The role of analogy in children’s interpretations of sentences. Cognitive Psychology, 31(1), 41–81.

  22. Francis, W., Kučera, H., & Mackie, A. (1982). Frequency analysis of English usage: Lexicon and grammar. Boston: Houghton Mifflin Harcourt (HMH).

  23. Frank, M., Goodman, N., & Tenenbaum, J. (2008). A Bayesian framework for cross-situational word learning. Advances in Neural Information Processing Systems, 20, 457–464.

  24. Frazier, L., & Fodor, J. D. (1978). The sausage machine: A new two-stage parsing model. Cognition, 13, 187–222.

  25. Gelman, S., & Taylor, M. (1984). How two-year-old children interpret proper and common names for unfamiliar objects. Child Development, 55, 1535–1540.

  26. Gibson, E., & Wexler, K. (1994). Triggers. Linguistic Inquiry, 25, 407–454.

  27. Godfrey, J., Holliman, E., & McDaniel, J. (1992). SWITCHBOARD: Telephone speech corpus for research and development. In 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-92) (Vol. 1). New York: IEEE.

  28. Gold, E. M. (1967). Language identification in the limit. Information and Control, 10(5), 447–474.

  29. Goldberg, A. E. (1999). Emergence of the semantics of argument structure constructions. In The emergence of language (Carnegie Mellon Symposia on Cognition Series, pp. 197–212). Mahwah: Lawrence Erlbaum Associates.

  30. Grünwald, P. (1996). A minimum description length approach to grammar inference. In S. Wermter, E. Riloff, & G. Scheler (Eds.), Connectionist, statistical and symbolic approaches to learning for natural language processing (Lecture Notes in Computer Science, Vol. 1040, pp. 203–216). Berlin/New York: Springer.

  31. Hsu, A. S., & Chater, N. (2010). The logical problem of language acquisition: A probabilistic perspective. Cognitive Science, 34(6), 972–1016.

  32. Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science, 20, 137–194.

  33. Keller, B., & Lutz, R. (1997). Evolving stochastic context-free grammars from examples using a minimum description length principle. In Workshop on Automata Induction, Grammatical Inference and Language Acquisition, ICML-97. San Francisco: Morgan Kaufmann Publishers.

  34. Leech, G. (1992). 100 million words of English: The British National Corpus (BNC). Language Research, 28(1), 1–13.

  35. Legate, J., & Yang, C. (2002). Empirical re-assessment of stimulus poverty arguments. Linguistic Review, 19(1/2), 151–162.

  36. Leonard, L. (2000). Children with specific language impairment. Cambridge: MIT Press.

  37. Li, M., & Vitányi, P. M. B. (1995). Computational machine learning in theory and praxis. In J. van Leeuwen (Ed.), Computer science today (Lecture notes in computer science, Vol. 1000). Heidelberg: Springer.

  38. MacWhinney, B. (1982). Basic syntactic processes. In S. Kuczaj (Ed.), Language development: Syntax and semantics (Vol. 1, pp. 73–136). Hillsdale, NJ: Lawrence Erlbaum.

  39. MacWhinney, B. (1987). The competition model. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Erlbaum.

  40. MacWhinney, B. (1993). Connections and symbols: Closing the gap. Cognition, 49, 291–296.

  41. MacWhinney, B. (1995). The CHILDES project: Tools for analyzing talk (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

  42. MacWhinney, B. (2004). A multiple process solution to the logical problem of language acquisition. Journal of Child Language, 31, 883–914.

  43. MacWhinney, B., Bird, S., Cieri, C., & Martell, C. (2004). TalkBank: Building an open unified multimodal database of communicative interaction. In Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon (pp. 525–528). Paris: ELRA.

  44. Marcus, G. F. (1993). Negative evidence in language acquisition. Cognition, 46, 53–85.

  45. Marcus, G. F., Pinker, S., Ullman, M., Hollander, M., Rosen, T. J., & Xu, F. (1992). Overregularization in language acquisition (Monographs of the society for research in child development, Vol. 57 (4, Serial No. 228)). Chicago: University of Chicago Press.

  46. Marcus, M., Santorini, B., & Marcinkiewicz, M. (1994). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.

  47. Marr, D. (1982). Vision. San Francisco, CA: W. H. Freeman.

  48. McClelland, J. L., Rumelhart, D. E., & The PDP Research Group (1986). Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 2). Cambridge, MA: Bradford Books/MIT Press.

  49. Mintz, T. (2003). Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90(1), 91–117.

  50. Parisse, C., & Le Normand, M. T. (2000). Automatic disambiguation of the morphosyntax in spoken language corpora. Behavior Research Methods, Instruments, and Computers, 32, 468–481.

  51. Perfors, A., Tenenbaum, J., & Wonnacott, E. (2010). Variability, negative evidence, and the acquisition of verb argument constructions. Journal of Child Language, 37, 607–642.

  52. Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.

  53. Pinker, S. (1994). How could a child use verb syntax to learn verb semantics? Lingua, 92, 377–410.

  54. Pullum, G., & Scholz, B. (2002). Empirical assessment of stimulus poverty arguments. Linguistic Review, 19(1/2), 9–50.

  55. Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14(5), 465–471.

  56. Roy, D. (2009). New horizons in the study of child language acquisition. In Proceedings of Interspeech 2009, Brighton. Grenoble: ISCA.

  57. Rumelhart, D., & McClelland, J. (1987). Learning the past tenses of English verbs: Implicit rules or parallel distributed processing. Mechanisms of language acquisition (pp. 195–248). Hillsdale: Erlbaum.

  58. Sagae, K., Davis, E., Lavie, A., MacWhinney, B., & Wintner, S. (2007). High-accuracy annotation and parsing of CHILDES transcripts. In Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition (pp. 25–32). Prague: Association for Computational Linguistics.

  59. Sagae, K., Davis, E., Lavie, A., MacWhinney, B., & Wintner, S. (2010). Morphosyntactic annotation of CHILDES transcripts. Journal of Child Language, 37(03), 705–729.

  60. Steedman, M., Baldridge, J., Bozsahin, C., Clark, S., Curran, J., & Hockenmaier, J. (2005). Grammar acquisition by child and machine: The combinatory manifesto. Invited Talk at the Ninth Conference on Computational Natural Language Learning (CoNLL-2005), Ann Arbor.

  61. Tanenhaus, M., Spivey-Knowlton, M., Eberhard, K., & Sedivy, J. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268(5217), 1632.

  62. Tenenbaum, J., Griffiths, T., & Kemp, C. (2006). Theory-based Bayesian models of inductive learning and reasoning. Trends in Cognitive Sciences, 10(7), 309–318.

  63. Tomasello, M. (2000). Do young children have adult syntactic competence? Cognition, 74, 209–253.

  64. Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge: Harvard University Press.

  65. Tomasello, M., Akhtar, N., Dodson, K., & Rekau, L. (1997). Differential productivity in young children’s use of nouns and verbs. Journal of Child Language, 24(02), 373–387.

  66. Villavicencio, A. (2002). The acquisition of a unification-based generalised categorial grammar. Ph.D. thesis, Computer Laboratory, University of Cambridge.

  67. Yang, C. (2002). Knowledge and learning in natural language. Oxford/New York: Oxford University Press.

  68. Yu, C., & Ballard, D. (2007). A unified model of early word learning: Integrating statistical and social cues. Neurocomputing, 70(13–15), 2149–2165.

  69. Yu, C., & Smith, L. (2006). Statistical cross-situational learning to build word-to-world mappings. In Proceedings of the 28th Annual Meeting of the Cognitive Science Society, Vancouver. Citeseer.

Author information

Corresponding author

Correspondence to Thierry Poibeau.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Poibeau, T., Villavicencio, A., Korhonen, A., Alishahi, A. (2013). Computational Modeling as a Methodology for Studying Human Language Learning. In: Villavicencio, A., Poibeau, T., Korhonen, A., Alishahi, A. (eds) Cognitive Aspects of Computational Language Acquisition. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31863-4_1

  • DOI: https://doi.org/10.1007/978-3-642-31863-4_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31862-7

  • Online ISBN: 978-3-642-31863-4
