Abstract
Anaphora resolution (AR) has attracted the attention of many researchers because of its relevance to machine translation, information retrieval, text summarization, and many other applications. AR is a difficult problem in NLP, especially in Semitic languages, because of their complex morphological structure. Anaphora can be defined as a linguistic relation between two textual entities in which one entity (the anaphor) refers to another entity in the text (the antecedent), which usually occurs before it. The process of determining the antecedent of an anaphor is called anaphora resolution. In this chapter, we present an account of the anaphora resolution task. The chapter consists of ten sections. Section 1 introduces the problem. Section 2 presents the different types of anaphora. Section 3 discusses the determinants and factors of anaphora resolution and their effect on improving system performance. Section 4 describes the process of anaphora resolution. Section 5 presents different approaches to resolving anaphora and reviews previous work in the field. Section 6 discusses recent work in anaphora resolution, and Section 7 addresses an important aspect of the process: the evaluation of AR systems. Sections 8 and 9 focus on anaphora resolution in Semitic languages in particular and on the difficulties and challenges facing researchers. Finally, Section 10 summarizes the chapter.
Notes
- 1.
An ISA hierarchy, also called an “is a” relationship, is an arrangement of items in which a higher-level item is the parent of the items derived from it, and the derived items are its children. In object-oriented terms, it means that attributes are inherited: if we declare A ISA B, every A entity is also considered to be a B entity. For example, let class A = {person} and class B = {male, female}. If B ISA A, every entity in B is also an A, which means that every male and every female is a person.
- 2.
The mentions contained in the gold standard, which is produced by a human expert, are referred to as TRUE or GOLD mentions, as opposed to the mentions contained in the system output, which are called SYSTEM or SYS mentions. A gold-standard annotation correctly identifies all NPs that are part of coreference chains.
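The ISA relationship described in note 1 corresponds closely to class inheritance. The following is a minimal Python sketch of that example, not code from the chapter; the class names `Person`, `Male`, and `Female`, the helper `is_a`, and the sample names are assumptions chosen to mirror the note.

```python
class Person:
    """Parent item (class A = {person} in note 1)."""

    def __init__(self, name):
        self.name = name


class Male(Person):
    """Derived item: Male ISA Person, so it inherits Person's attributes."""


class Female(Person):
    """Derived item: Female ISA Person, so it inherits Person's attributes."""


def is_a(entity, cls):
    """Return True if `entity` belongs to `cls` or to any class derived from it."""
    return isinstance(entity, cls)


# Hypothetical sample entities, used only for illustration.
people = [Male("Adam"), Female("Sara")]

# Every male and every female is a person ...
all_are_persons = all(is_a(p, Person) for p in people)

# ... but the relation does not hold in the other direction.
plain_person_is_male = is_a(Person("Noor"), Male)
```

Here `all_are_persons` holds because both derived classes inherit from `Person`, while `plain_person_is_male` does not, which reflects the one-way nature of the ISA relation.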
Further Reading
Asher N, Lascarides A (2003) Logics of conversation. Cambridge University Press, Cambridge/New York
Bengtson E, Roth D (2008) Understanding the value of features for coreference resolution. In: Proceedings of the 2008 conference on empirical methods in natural language processing, Honolulu, pp 294–303
Boguraev B, Kennedy C (1997) Salience-based content characterisation of documents. In: Proceedings of the ACL’97/EACL’97 workshop on intelligent scalable text summarisation, Madrid, pp 3–9
Cai J, Strube M (2010) End-to-end coreference resolution via hypergraph partitioning. In: Proceedings of the 23rd international conference on computational linguistics, Beijing, 23–27 Aug 2010, pp 143–151
Chen B, Su J, Tan CL (2010) A twin-candidate based approach for event pronoun resolution using composite kernel. In: Proceedings of COLING 2010, Beijing, pp 188–196
Diab M, Hacioglu K, Jurafsky D (2004) Automatic tagging of Arabic text: from raw text to base phrase chunks. In: Dumas S, Marcus D, Roukos S (eds) HLT-NAACL 2004: short papers. Association for Computational Linguistics, Boston, pp 140–152
Elghamry K, El-Zeiny N, Al-Sabbagh R (2007) Arabic anaphora resolution using the web as corpus. In: Proceedings of the 7th conference of language engineering, the Egyptian society of language engineering, Cairo, pp 294–318
Gasperin C (2009) Statistical anaphora resolution in biomedical texts. Technical report, Computer Laboratory, University of Cambridge
Ng V (2010) Supervised noun phrase coreference research: the first fifteen years. In: Proceedings of the 48th annual meeting of the association for computational linguistics, Uppsala, pp 1396–1411
Nøklestad A (2009) A machine learning approach to anaphora resolution including named entity recognition, PP attachment disambiguation, and animacy detection. PhD thesis, Faculty of Humanities, University of Oslo, Norway, 298 pp
Recasens M, Martí T, Taulé M, Màrquez L, Sapena E (2010) SemEval-2010 task 1: coreference resolution in multiple languages. In: Proceedings of the NAACL HLT workshop on semantic evaluations: recent achievements and future directions, Los Angeles, pp 70–75
Wintner S (2004) Hebrew computational linguistics: past and future. Artif Intell Rev 21(2):113–138
Zhao Sh, Ng HT (2010) Maximum metric score training for coreference resolution. In: Proceedings of the 23rd international conference on computational linguistics (COLING’10), Stroudsburg, pp 1308–1316
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Seddik, K.M., Farghaly, A. (2014). Anaphora Resolution. In: Zitouni, I. (ed.) Natural Language Processing of Semitic Languages. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45358-8_8
DOI: https://doi.org/10.1007/978-3-642-45358-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45357-1
Online ISBN: 978-3-642-45358-8
eBook Packages: Computer Science (R0)