Abstract
In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus corroborate this point: a quantitative investigation assesses which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail. We then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement, and argue that a deeper understanding of the phenomena involved allows problematic cases to be tackled in a more principled fashion than would be possible using only pre-theoretic intuitions.
References
Asher N. (2006) Things and their aspects. Philosophical Issues 16(1): 1–23
Asher N., Lascarides A. (2003) Logics of conversation. Cambridge University Press, Cambridge
Asher, N., & Pustejovsky, J. (2005). Word meaning and commonsense metaphysics. http://semanticsarchive.net/Archive/TgxMDNkM/.
Burchardt, A., Erk, K., Frank, A., Kowalski, A., Padó, S., & Pinkal, M. (2006). The SALSA Corpus: A German corpus resource for lexical semantics. In Proceedings of LREC 2006.
Cahill, A., McCarthy, M., van Genabith, J., & Way, A. (2002). Parsing with PCFGs and automatic F-structure annotation. In Proceedings of the Seventh International Conference on LFG. CSLI Publications.
Carletta J. (1996) Assessing agreement on classification tasks: The kappa statistic. Computational Linguistics 22(2): 249–254
Castaño, J., Zhang, J., & Pustejovsky, J. (2002). Anaphora resolution in biomedical literature. In International Symposium on Reference Resolution.
Charniak, E., & Johnson, M. (2005). Coarse-to-fine n-best parsing and maxEnt discriminative reranking. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005).
Chiarcos, C., & Krasavina, O. (2005). PoCoS—Potsdam Coreference Scheme. Technical report, SFB 632 “Information structure: The linguistic means for structuring utterances, sentences and texts”.
Cicchetti D.V., Feinstein A.R. (1990) High agreement but low kappa: II. Resolving the paradoxes. Journal of Clinical Epidemiology 43(6): 551–558
Cohen J. (1960) A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20(1): 37–46
Dickinson, M., & Meurers, W. D. (2005). Prune diseased branches to get healthy trees! How to find erroneous local trees in a treebank and why it matters. In Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005). Barcelona, Spain.
Di Eugenio B., Glass M. (2004) The Kappa statistic: A second look. Computational Linguistics 30(1): 95–101
Fauconnier, G. (1984). Espaces Mentaux. Editions de Minuit.
Gardent C., Manuélian H. (2005) Création d’un corpus annoté pour le traitement des descriptions définies. Traitement Automatique des Langues 46(1): 115–140
Hinrichs, E., Kübler, S., & Naumann, K. (2005). A unified representation for morphological, syntactic, semantic and referential annotations. In ACL Workshop on Frontiers in Corpus Annotation II: Pie in the Sky. Ann Arbor.
Hirschman, L., Robinson, P., Burger, J., & Vilain, M. (1997). Automating coreference: The role of annotated training data. In Proceedings of AAAI Spring Symposium on Applying Machine Learning to Discourse Processing.
Hobbs, J. (1985). Granularity. In Proceedings IJCAI 1985.
Hockenmaier, J., & Steedman, M. (2002). Acquiring compact lexicalized grammars from a cleaner treebank. In Proceedings LREC 2002.
Hoste, V., & Daelemans, W. (2004). Learning Dutch coreference resolution. In Fifteenth Computational Linguistics in the Netherlands Meeting (CLIN 2004).
Hripcsak G., Rothschild A.S. (2005) Agreement, the F-measure, and reliability in information retrieval. Journal of the American Medical Informatics Association 12: 296–298
Karttunen, L. (1976). Discourse Referents. In J. D. McCawley (Ed.), Syntax and semantics 7: Notes from the linguistic underground (pp. 363–385). Academic Press.
Kintsch W., van Dijk T. (1978) Toward a model of text comprehension and production. Psychological Review 85: 363–394
Knees, M. (2006). The German temporal anaphor danach—ambiguity in interpretation and annotation. In ESSLLI 2006 workshop on Ambiguity and Anaphora.
Link G. (1983) The logical analysis of plurals and mass terms: A lattice-theoretical approach. In: Bäuerle R., Schwarze C., Stechow A. (eds) Meaning, use and interpretation of language. de Gruyter, NY, USA
Luo, X., Ittycheriah, A., Jing, H., Kambhatla, N., & Roukos, S. (2004). A mention-synchronous coreference resolution algorithm based on the bell tree. In ACL 2004.
Magerman, D. M. (1995). Statistical decision-tree models for parsing. In ACL’1995.
Mani I. (1998) A theory of granularity and its application to problems of polysemy and underspecification of meaning. In: Cohn A.G., Schubert L.K., Shapiro S.C. (eds) Principles of Knowledge Representation and Reasoning: Proceedings of the Sixth International Conference (KR’98). Morgan Kaufmann, San Mateo, Menlo Park, pp 245–255
McCarthy, J. F., & Lehnert, W. G. (1995). Using decision trees for coreference resolution (pp. 1050–1055). In IJCAI 1995.
Meurers W.D. (2005) On the use of electronic corpora for theoretical linguistics. Case studies from the syntax of German. Lingua 115(11): 1619–1639
Miyao, Y., & Tsujii, J. (2005). Probabilistic disambiguation models for wide-coverage HPSG parsing. In ACL 2005.
MUC6. (1995). MUC-6 coreference task definition. DARPA Information Technology Office Tipster Text Program.
Passonneau, R. (1997). Applying reliability metrics to co-reference annotation. Technical Report CUCS-025-03, Columbia University.
Poesio, M. (2000). The GNOME annotation scheme manual. Technical report, University of Edinburgh, HCRC and Informatics. http://www.hcrc.ed.ac.uk/~gnome.
Poesio, M. (2004). The MATE/GNOME scheme for anaphoric annotation, revisited. In Proceedings of SIGDIAL’04. Boston.
Poesio, M., & Artstein, R. (2005). Annotating (Anaphoric) ambiguity. In Corpus Linguistics 2005. Birmingham.
Poesio, M., & Reyle, U. (2001). Underspecification in anaphoric reference. In Fourth International Workshop on Computational Semantics (IWCS-4).
Poesio, M., Reyle, U., & Stevenson, R. (2003). Justified sloppiness in anaphoric reference. In H. Bunt & R. Muskens (Eds.), Computing meaning 3. Dordrecht: Kluwer. (To appear).
Poesio M., Sturt P., Artstein R., Filik R. (2006) Underspecification and anaphora: Theoretical issues and preliminary evidence. Discourse Processes 42(2): 152–175
Reitsma, F., & Bittner, T. (2003). Process, hierarchy and scale. In W. Kuhn, M. Worboys & S. Timpf (Eds.), Spatial information theory. Cognitive and computational foundations of geographic information science (COSIT’03).
Setzer, A., & Gaizauskas, R. (2001). A pilot study on annotating temporal relations in text. In ACL 2001 Workshop on Temporal and Spatial Information Processing.
Smith B., Brogaard B. (2001) A unified theory of truth and reference. Logique et Analyse 43(169–170): 49–93
Strassel, S., Walker, C., & Mitchell, A. (2004). Annotation consistency study. Slides found at http://projects.ldc.upenn.edu/ace/workshops/Feb2004.html.
Uryupina, O. (2006). Coreference resolution with and without linguistic knowledge. In Proceedings of LREC 2006.
van Deemter K., Kibble R. (2000) On coreferring: Coreference in MUC and related annotation schemes. Computational Linguistics 26(4): 629–637
van Rijsbergen, C. J. K. (1979). Information retrieval. Butterworths.
Vilain, M., Burger, J., Aberdeen, J., Connolly, D., & Hirschman, L. (1995). A model-theoretic coreference scoring scheme. In Proceedings of the 6th Message Understanding Conference.
Zaenen, A., Carletta, J., Garretson, G., Bresnan, J., Koontz-Garboden, A., Nikitina, T., O’Connor, M. C., & Wasow, T. (2004). Animacy encoding in English: Why and how. In ACL 2004 Workshop on Discourse Annotation.
Cite this article
Versley, Y. Vagueness and Referential Ambiguity in a Large-Scale Annotated Corpus. Res on Lang and Comput 6, 333–353 (2008). https://doi.org/10.1007/s11168-008-9059-1