Advanced Machine Learning Models for Coreference Resolution

  • Chapter
Anaphora Resolution

Abstract

Despite being the most influential learning-based coreference model, the mention-pair model is unsatisfactory from both a linguistic perspective and a modeling perspective: its focus on making local coreference decisions involving only two mentions and their contexts makes it even less expressive than the coreference systems developed in the pre-statistical NLP era. Realizing its weaknesses, researchers have developed many advanced coreference models over the years. In particular, there is a gradual shift from local models towards global models, which seek to address the weaknesses of local models by exploiting additional information beyond that of the local context. In this chapter, we will discuss these advanced models for coreference resolution.
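The weakness of purely local decisions can be made concrete with a small sketch. The pairwise decisions below are hypothetical classifier outputs invented for illustration, and the function names are assumptions rather than anything defined in the chapter. Merging positive pairwise decisions by transitive closure can silently override a negative decision, because no single pairwise decision sees the whole cluster.

```python
from itertools import combinations


def cluster_by_transitive_closure(mentions, positive_pairs):
    """Merge mentions linked by positive pairwise decisions (union-find)."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in positive_pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())


def overridden_negatives(clusters, positive_pairs):
    """Pairs placed in one cluster although the classifier never linked them."""
    positive = {frozenset(p) for p in positive_pairs}
    return [
        (a, b)
        for cluster in clusters
        for a, b in combinations(cluster, 2)
        if frozenset((a, b)) not in positive
    ]


# Hypothetical local decisions: each adjacent pair looks plausible in isolation,
# but the pair ("Mr. Clinton", "she") was judged non-coreferent.
mentions = ["Mr. Clinton", "Clinton", "she"]
positive_pairs = [("Mr. Clinton", "Clinton"), ("Clinton", "she")]

clusters = cluster_by_transitive_closure(mentions, positive_pairs)
conflicts = overridden_negatives(clusters, positive_pairs)
print(clusters)
print(conflicts)
```

A global model would instead score the candidate cluster as a whole, so the gender mismatch between "Mr. Clinton" and "she" could block the merge.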

Notes

  1. In many existing coreference resolvers, a mention is typically considered a name alias of another mention if one is an abbreviation or an acronym of the other.

  2. www.ldc.upenn.edu

  3. Note that only mention boundaries are used.

  4. Available from http://crfpp.sourceforge.net

  5. For this and subsequent uses of the SVM learner in their experiments, Rahman and Ng set all the learning parameters to their default values.

  6. Rahman and Ng used Approximate Randomization [42] for testing statistical significance, with p set to 0.05.

  7. The correct partition will receive a perfect score, of course.

  8. Note that the model proposed by Daumé III and Marcu is a model for jointly performing mention detection and coreference resolution. In our discussion, we focus on the portion of their model that is relevant to learning a coreference partition. See their paper [11] for details.
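The significance test named in note 6 can be sketched generically. The code below is a minimal paired approximate randomization test on per-document scores, not a reconstruction of Rahman and Ng's setup: the function name, the number of trials, the fixed seed, and the example scores are all assumptions made for illustration.

```python
import random


def approximate_randomization(scores_a, scores_b, trials=1000, seed=0):
    """Two-sided paired approximate randomization test.

    Returns an estimated p-value for the hypothesis that the two systems
    do not differ in total (equivalently, mean) per-document score.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be paired")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    at_least_as_extreme = 0
    for _ in range(trials):
        # Swapping a document's two scores flips the sign of its difference.
        shuffled = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (trials + 1)


# Hypothetical per-document scores for two systems.
identical = [0.6, 0.7, 0.5]
p_same = approximate_randomization(identical, identical)
p_diff = approximate_randomization([1.0] * 20, [0.0] * 20)
print(p_same, p_diff)
```

With a threshold of 0.05, as in note 6, a difference would be called significant when the returned value falls below that threshold.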

References

  1. Alshawi, H., Carter, D., van Eijck, J., Moore, R., Moran, D., Pulman, S.: Overview of the Core Language Engine. In: Proceedings of the International Conference on Fifth Generation Computer Systems, Tokyo, pp. 1108–1115 (1988)

  2. Bagga, A., Baldwin, B.: Entity-based cross-document coreferencing using the vector space model. In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics, Montreal, pp. 79–85 (1998)

  3. Bikel, D.M., Schwartz, R., Weischedel, R.M.: An algorithm that learns what’s in a name. Mach. Learn.: Spec. Issue Nat. Lang. Learn. 34 (1–3), 211–231 (1999)

  4. Brennan, S.E., Friedman, M.W., Pollard, C.J.: A centering approach to pronouns. In: Proceedings of the 25th Annual Meeting of the Association for Computational Linguistics, Stanford, pp. 155–162 (1987)

  5. Carter, D.M.: Interpreting Anaphors in Natural Language Texts. Ellis Horwood, Chichester (1987)

  6. Collins, M.: Discriminative training methods for Hidden Markov Models: theory and experiments with perceptron algorithms. In: Proceedings of the 2002 Empirical Methods in Natural Language Processing, Prague, pp. 1–8 (2002)

  7. Connolly, D., Burger, J.D., Day, D.S.: A machine learning approach to anaphoric reference. In: Proceedings of International Conference on New Methods in Language Processing, New Brunswick, pp. 255–261 (1994)

  8. Connolly, D., Burger, J.D., Day, D.S.: A machine learning approach to anaphoric reference. In: New Methods in Language Processing, pp. 133–144. UCL Press, London (1997)

  9. Crammer, K., Singer, Y.: Ultraconservative online algorithms for multiclass problems. J. Mach. Learn. Res. 3, 951–991 (2003)

  10. Culotta, A., Wick, M., McCallum, A.: First-order probabilistic models for coreference resolution. In: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, Rochester, pp. 81–88 (2007)

  11. Daumé III, H., Marcu, D.: A large-scale exploration of effective global features for a joint entity detection and tracking model. In: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, pp. 97–104 (2005)

  12. Daumé III, H., Marcu, D.: Learning as search optimization: approximate large margin methods for structured prediction. In: Proceedings of the 22nd International Conference on Machine Learning, Bonn, pp. 169–176 (2005)

  13. Denis, P., Baldridge, J.: A ranking approach to pronoun resolution. In: Proceedings of the Twentieth International Conference on Artificial Intelligence, Hyderabad, pp. 1588–1593 (2007)

  14. Denis, P., Baldridge, J.: Specialized models and ranking for coreference resolution. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, Honolulu, pp. 660–669 (2008)

  15. Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT, Cambridge, MA (1998)

  16. Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, pp. 363–370 (2005)

  17. Finley, T., Joachims, T.: Supervised clustering with support vector machines. In: Proceedings of the 22nd International Conference on Machine Learning, Bonn, pp. 217–224 (2005)

  18. Gaizauskas, R., Wakao, T., Humphreys, K., Cunningham, H., Wilks, Y.: Description of the LaSIE system as used for MUC-6. In: Proceedings of the Sixth Message Understanding Conference (MUC-6), pp. 207–220. Morgan Kaufmann, Columbia (1995)

  19. Ge, N., Hale, J., Charniak, E.: A statistical approach to anaphora resolution. In: Proceedings of the Sixth Workshop on Very Large Corpora, Montreal, pp. 161–170 (1998)

  20. Haghighi, A., Klein, D.: Unsupervised coreference resolution in a nonparametric Bayesian model. In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, pp. 848–855 (2007)

  21. Haghighi, A., Klein, D.: Coreference resolution in a modular, entity-centered model. In: Proceedings of Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 385–393 (2010)

  22. Heim, I.: The semantics of definite and indefinite noun phrases. Ph.D. thesis, University of Massachusetts at Amherst, Amherst (1982)

  23. Hobbs, J.: Resolving pronoun references. Lingua 44, 311–338 (1978)

  24. Iida, R., Inui, K., Takamura, H., Matsumoto, Y.: Incorporating contextual cues in trainable models for coreference resolution. In: Proceedings of the EACL Workshop on the Computational Treatment of Anaphora, Budapest (2003)

  25. Joachims, T.: Optimizing search engines using clickthrough data. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, pp. 133–142 (2002)

  26. Kabadjov, M.A.: Task-oriented evaluation of anaphora resolution. Ph.D. thesis, University of Essex, Colchester (2007)

  27. Kamp, H.: A theory of truth and semantic interpretation. In: Formal Methods in the Study of Language. Mathematical Centre, Amsterdam (1981)

  28. Karttunen, L.: Discourse referents. In: Syntax and Semantics 7 – Notes from the Linguistic Underground, pp. 363–385. Academic, New York/London (1976)

  29. Lafferty, J.D., McCallum, A., Pereira, F.C.N.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of 18th International Conference on Machine Learning, Williamstown, pp. 282–289 (2001)

  30. Lappin, S., Leass, H.: An algorithm for pronominal anaphora resolution. Comput. Linguist. 20 (4), 535–562 (1994)

  31. Lee, H., Chang, A., Peirsman, Y., Chambers, N., Surdeanu, M., Jurafsky, D.: Deterministic coreference resolution based on entity-centric, precision-ranked rules. Comput. Linguist. 39 (4), 885–916 (2013)

  32. Luo, X.: On coreference resolution performance metrics. In: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, pp. 25–32 (2005)

  33. Luo, X., Ittycheriah, A., Jing, H., Kambhatla, N., Roukos, S.: A mention-synchronous coreference resolution algorithm based on the Bell tree. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, pp. 135–142 (2004)

  34. LuperFoy, S.: The representation of multimodal user interface dialogues using discourse pegs. In: Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, Newark, pp. 22–31 (1992)

  35. McCallum, A., Wellner, B.: Toward conditional models of identity uncertainty with application to proper noun coreference. In: Proceedings of the IJCAI Workshop on Information Integration on the Web, Acapulco (2003)

  36. McCallum, A., Wellner, B.: Conditional models of identity uncertainty with application to noun coreference. In: Advances in Neural Information Processing Systems. MIT, Cambridge (2004)

  37. MUC-6: Proceedings of the Sixth Message Understanding Conference. Morgan Kaufmann, San Francisco (1995)

  38. MUC-7: Proceedings of the Seventh Message Understanding Conference. Morgan Kaufmann, San Francisco (1998)

  39. Ng, V.: Supervised ranking for pronoun resolution: some recent improvements. In: Proceedings of the 20th National Conference on Artificial Intelligence, Edinburgh, pp. 1081–1086 (2005)

  40. Ng, V.: Unsupervised models for coreference resolution. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, Honolulu, pp. 640–649 (2008)

  41. Ng, V., Cardie, C.: Identifying anaphoric and non-anaphoric noun phrases to improve coreference resolution. In: Proceedings of the 19th International Conference on Computational Linguistics, Taipei, pp. 730–736 (2002)

  42. Noreen, E.W.: Computer Intensive Methods for Testing Hypotheses: An Introduction. Wiley, New York (1989)

  43. Poesio, M., Kabadjov, M.A.: A general-purpose, off-the-shelf anaphora resolution module: implementation and preliminary evaluation. In: Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, pp. 663–666 (2004)

  44. Poon, H., Domingos, P.: Joint unsupervised coreference resolution with Markov Logic. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, Honolulu, pp. 650–659 (2008)

  45. Prince, E.: Toward a taxonomy of given-new information. In: Cole, P. (ed.) Radical Pragmatics, pp. 223–255. Academic, New York (1981)

  46. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)

  47. Raghunathan, K., Lee, H., Rangarajan, S., Chambers, N., Surdeanu, M., Jurafsky, D., Manning, C.: A multi-pass sieve for coreference resolution. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Boston, pp. 492–501 (2010)

  48. Rahman, A., Ng, V.: Supervised models for coreference resolution. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, pp. 968–977 (2009)

  49. Rahman, A., Ng, V.: Ensemble-based coreference resolution. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, pp. 1884–1889 (2011)

  50. Recasens, M.: Coreference: Theory, annotation, resolution and evaluation. Ph.D. thesis, University of Barcelona, Barcelona (2010)

  51. Sidner, C.: Towards a computational theory of definite anaphora comprehension in English discourse. Ph.D. thesis, Massachusetts Institute of Technology (1979)

  52. Soon, W.M., Ng, H.T., Lim, D.C.Y.: A machine learning approach to coreference resolution of noun phrases. Comput. Linguist. 27 (4), 521–544 (2001)

  53. Stoyanov, V., Gilbert, N., Cardie, C., Riloff, E.: Conundrums in noun phrase coreference resolution: making sense of the state-of-the-art. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Singapore, pp. 656–664 (2009)

  54. Strube, M., Hahn, U.: Functional centering – grounding referential coherence in information structure. Comput. Linguist. 25 (3), 309–344 (1999)

  55. Toutanova, K., Klein, D., Manning, C.D., Singer, Y.: Feature-rich part-of-speech tagging with a cyclic dependency network. In: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Edmonton, pp. 173–180 (2003)

  56. Versley, Y., Ponzetto, S.P., Poesio, M., Eidelman, V., Jern, A., Smith, J., Yang, X., Moschitti, A.: BART: a modular toolkit for coreference resolution. In: Proceedings of the ACL-08: HLT Demo Session, Columbus, pp. 9–12 (2008)

  57. Vieira, R., Poesio, M.: An empirically-based system for processing definite descriptions. Comput. Linguist. 26 (4), 539–593 (2000)

  58. Vilain, M., Burger, J., Aberdeen, J., Connolly, D., Hirschman, L.: A model-theoretic coreference scoring scheme. In: Proceedings of the Sixth Message Understanding Conference, pp. 45–52. Morgan Kaufmann, San Francisco (1995)

  59. Webber, B.L.: A Formal Approach to Discourse Anaphora. Garland Publishing, Inc., New York (1979)

  60. Yang, X., Su, J., Lang, J., Tan, C.L., Li, S.: An entity-mention model for coreference resolution with inductive logic programming. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Columbus, pp. 843–851 (2008)

  61. Yang, X., Su, J., Tan, C.L.: Improving pronoun resolution using statistics-based semantic compatibility information. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, University of Michigan, pp. 165–172 (2005)

  62. Yang, X., Su, J., Tan, C.L.: A twin-candidate model for learning-based anaphora resolution. Comput. Linguist. 34 (3), 327–356 (2008)

  63. Yang, X., Su, J., Zhou, G., Tan, C.L.: An NP-cluster based approach to coreference resolution. In: Proceedings of the 20th International Conference on Computational Linguistics, Geneva, pp. 226–232 (2004)

  64. Yang, X., Zhou, G., Su, J., Tan, C.L.: Coreference resolution using competitive learning approach. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, pp. 176–183 (2003)

Corresponding author

Correspondence to Vincent Ng.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Ng, V. (2016). Advanced Machine Learning Models for Coreference Resolution. In: Poesio, M., Stuckardt, R., Versley, Y. (eds) Anaphora Resolution. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47909-4_10

  • DOI: https://doi.org/10.1007/978-3-662-47909-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-47908-7

  • Online ISBN: 978-3-662-47909-4

  • eBook Packages: Computer Science (R0)
