Using Very Large Parsed Corpora and Judgment Data to Classify Verb Reflexivity

  • Conference paper
Anaphora: Analysis, Algorithms and Applications (DAARC 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4410)

Abstract

Dutch has two reflexive pronouns, zich and zichzelf. When is each one used? This question has been debated in the literature on binding theory, reflexives and anaphora resolution. Partial solutions have used syntactic binding domains, semantic features and pragmatic concepts such as focus to predict reflexive choice, but until now no experimental data have been available either for or against any of these theories. In this paper we examine reflexive choice on the basis of empirical data from two sources: a large-scale corpus study and an online questionnaire. The results of both allow us to predict the choice between the two Dutch reflexives without assuming a priori a distinction between verbs that occur with zich and verbs that occur with zichzelf (cf. distinctions in terms of ‘inherent reflexivity’ (Reinhart and Reuland, 1993)). Instead, we examine the distribution of zich and zichzelf in the Clef corpus, a 70 million word Very Large Corpus of Dutch that is tagged and parsed. This allows us to identify the kind of action each verb is typically used to describe: reflexive or non-reflexive. Regression analysis shows that this verb reflexivity predicts 21% of the distribution of the two reflexive items in the corpus. The verb reflexivity found in the corpus study also explains 83% of the participants’ choices between zich and zichzelf in the online study. As such, both the corpus study and the online questionnaire confirm the group of verbs called ‘inherent reflexive verbs’ without postulating the group beforehand. We further found that even inherently reflexive verbs, which are argued never to co-occur with zichzelf, sometimes had zichzelf chosen as the preferred argument in the questionnaire and, to a lesser degree, in the corpus, suggesting that the verb classes are tendential rather than categorical.
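The kind of analysis the abstract describes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not define how verb reflexivity is measured or which regression model is used, so the sketch takes a verb's reflexivity to be the share of its object uses that are reflexive, takes the outcome to be the share of zichzelf among those reflexive uses, and fits a simple least-squares line. The function names `verb_profile` and `r_squared` are hypothetical, and the counts are invented toy numbers, not data from the paper.

```python
def verb_profile(zich, zichzelf, other):
    """Return (reflexivity, zichzelf_share) for one verb.

    Assumed definitions (not taken from the paper):
      reflexivity    = reflexive objects / all objects
      zichzelf_share = zichzelf / (zich + zichzelf)
    """
    reflexive = zich + zichzelf
    return reflexive / (reflexive + other), zichzelf / reflexive


def r_squared(xs, ys):
    """Share of variance in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / syy


# Invented toy counts per verb: (zich, zichzelf, non-reflexive object).
toy_counts = {
    "schamen": (98, 2, 0),
    "wassen": (60, 20, 120),
    "verdedigen": (30, 30, 240),
    "haten": (2, 18, 380),
}

profiles = {verb: verb_profile(*c) for verb, c in toy_counts.items()}
reflexivity = [p[0] for p in profiles.values()]
zichzelf_share = [p[1] for p in profiles.values()]

print(f"R^2 on toy data: {r_squared(reflexivity, zichzelf_share):.2f}")
```

In this reading, figures such as the 21% and 83% reported above correspond to the quantity `r_squared` returns, computed on the real corpus and questionnaire data rather than on toy counts.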

References

  • Anderson, M.: Noun phrase structure. Unpublished doctoral dissertation, MIT, Cambridge, MA (1979)

  • Bouma, G., van Noord, G., Malouf, R.: Alpino: Wide-coverage computational analysis of Dutch. In: Computational Linguistics in the Netherlands 2000. Rodopi, Amsterdam (2001)

  • Broekhuis, H.: The referential properties of noun phrases I, 2nd edn. Modern grammar of Dutch occasional papers 1, University of Tilburg (2004)

  • Everaert, M.: The syntax of reflexivization. Foris Publications, Dordrecht (1987)

  • Geurts, B.: Weak and Strong Reflexives in Dutch. In: Schlenker, P., Keenan, E. (eds.) Proceedings of the ESSLLI workshop on semantic approaches to binding theory (2004)

  • Gleason, H.: Linguistics and English Grammar. Holt, Rinehart and Winston, New York (1965)

  • Haeseryn, W., et al.: Algemene Nederlandse Spraakkunst (Second, totally revised version of 2002). Martinus Nijhoff, Groningen (2002)

  • Jakubowicz, C.: Sig en danois: syntaxe et acquisition. In: Obenauer, H.-G., Zribi-Hertz, A. (eds.) Structure de la phrase et théorie du liage, pp. 121–149. Presses Universitaires de Vincennes, Saint Denis (1992)

  • Jespersen, O.: Essentials of English grammar. Allen and Unwin, London (1933)

  • Jijkoun, V., Mishne, G., de Rijke, M.: Preprocessing Documents to Answer Dutch Questions. In: Proceedings 15th Belgian-Dutch Conference on Artificial Intelligence (BNAIC’03) (2003)

  • Keenan, E.: On semantics and the binding theory. In: Edwards, J. (ed.) Explain language universals, Blackwell, Oxford (1988)

  • Lekakou, M.: Reflexives in contexts of reduced valency: German vs Dutch. In: den Dikken, M., Tortora, C.M. (eds.) The Function of Function Words and Functional Categories, John Benjamins, Amsterdam (2005)

  • Lidz, J.: Condition R. Linguistic Inquiry 32(1), 123–140 (2001)

  • Partee, B., Bach, E.: Quantification, pronouns, and VP-anaphora. In: Formal methods in the study of language. Mathematisch Centrum, Amsterdam University (1981)

  • Reinhart, T., Reuland, E.: Reflexivity. Linguistic Inquiry 24, 656–720 (1993)

  • Reuland, E., Koster, J.: Long-distance anaphora: an overview. In: Koster, J., Reuland, E. (eds.) Long-distance anaphora, pp. 1–25. Cambridge University Press, Cambridge (1991)

  • Zubizarreta, M.L.: Levels of Representation in the Lexicon and in the Syntax. Foris Publications, Dordrecht (1987)

Editor information

António Branco

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Smits, E.J., Hendriks, P., Spenader, J. (2007). Using Very Large Parsed Corpora and Judgment Data to Classify Verb Reflexivity. In: Branco, A. (ed.) Anaphora: Analysis, Algorithms and Applications. DAARC 2007. Lecture Notes in Computer Science, vol 4410. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71412-5_6

  • DOI: https://doi.org/10.1007/978-3-540-71412-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71411-8

  • Online ISBN: 978-3-540-71412-5

  • eBook Packages: Computer Science (R0)
