Variable Selection in Logistic Regression: The British English Dative Alternation

  • Conference paper
Interfaces: Explorations in Logic, Language and Computation (ESSLLI 2008, ESSLLI 2009)

Abstract

This paper addresses the problem of selecting the ‘optimal’ variable subset in a logistic regression model for a medium-sized data set. As a case study, we take the British English dative alternation, where speakers and writers can choose between two equally grammatical syntactic constructions to express the same meaning. With 29 explanatory variables taken from the literature, we build two types of models: one with the verb sense included as a random effect, and one without a random effect. For each type, we build three models using three selection strategies: (1) including all variables and keeping only the significant ones, (2) successively adding the most predictive variable (forward selection), and (3) successively removing the least predictive variable (backward elimination). Because the six approaches lead to six different variable selections (and thus six different models), we conclude that selecting the ‘best’ model requires a substantial amount of linguistic expertise.
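
To make the two stepwise strategies named in the abstract concrete, the following is a minimal Python sketch of forward selection and backward elimination for a plain logistic regression without a random effect. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not say how "most predictive" is measured, so AIC is assumed here as the criterion, the model is fitted with a simple gradient ascent, and the helper names (`fit_aic`, `forward_selection`, `backward_elimination`, `make_toy_data`) and the three-variable toy data are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_aic(rows, y, cols, steps=200, lr=0.5):
    """Fit a logistic regression on the chosen columns by gradient ascent
    and return its AIC (2 * parameters - 2 * log-likelihood).
    AIC is an assumed criterion, not one stated in the abstract."""
    X = [[1.0] + [r[c] for c in cols] for r in rows]  # leading 1.0 = intercept
    w = [0.0] * (len(cols) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            err = t - sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    ll = 0.0
    for x, t in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return 2 * len(w) - 2 * ll


def forward_selection(rows, y, n_vars):
    """Start from the intercept-only model; repeatedly add the variable
    that lowers AIC the most, and stop when no addition lowers it."""
    selected, best = [], fit_aic(rows, y, [])
    while len(selected) < n_vars:
        scores = {c: fit_aic(rows, y, selected + [c])
                  for c in range(n_vars) if c not in selected}
        cand = min(scores, key=scores.get)
        if scores[cand] >= best:
            break
        selected.append(cand)
        best = scores[cand]
    return selected


def backward_elimination(rows, y, n_vars):
    """Start from the full model; repeatedly drop the variable whose
    removal lowers AIC the most, and stop when no removal lowers it."""
    kept = list(range(n_vars))
    best = fit_aic(rows, y, kept)
    while kept:
        scores = {c: fit_aic(rows, y, [k for k in kept if k != c])
                  for c in kept}
        cand = min(scores, key=scores.get)
        if scores[cand] >= best:
            break
        kept.remove(cand)
        best = scores[cand]
    return kept


def make_toy_data(n=200, seed=0):
    """Hypothetical data: only column 0 influences the binary outcome."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(3.0 * r[0]) else 0 for r in rows]
    return rows, y


rows, y = make_toy_data()
print("forward :", forward_selection(rows, y, 3))
print("backward:", backward_elimination(rows, y, 3))
```

Because the two procedures search from opposite ends (the empty model versus the full model), they are not guaranteed to agree on the final subset, which is the kind of divergence the abstract reports across its six approaches. A model with verb sense as a random effect would need a mixed-effects fitting routine and is outside the scope of this sketch.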

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Theijssen, D. (2010). Variable Selection in Logistic Regression: The British English Dative Alternation. In: Icard, T., Muskens, R. (eds) Interfaces: Explorations in Logic, Language and Computation. ESSLLI 2008, ESSLLI 2009. Lecture Notes in Computer Science, vol 6211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14729-6_7

  • DOI: https://doi.org/10.1007/978-3-642-14729-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14728-9

  • Online ISBN: 978-3-642-14729-6

  • eBook Packages: Computer Science, Computer Science (R0)
