Off-the-Shelf Tools

Abstract

Off-the-shelf coreference tools can be a useful ingredient for downstream applications such as machine translation, information extraction, or sentiment recognition. In this chapter, we first present the properties that matter most when integrating a coreference system into a larger context. We then describe the BART system, the dCoref system that is part of Stanford’s CoreNLP suite, and IMSCoref and HOTCoref as examples of state-of-the-art systems that are purely based on machine learning. We finish the chapter by outlining a checklist-based approach to choosing, integrating, and adapting a coreference system for a putative new application context.

Notes

  1. In the context of this discussion, “standardized” amounts to being described well enough that it is possible to write interoperable programs that solve edge cases in the same way, and that it is possible to get stakeholders to agree on one particular interpretation of that description. Formal endorsement by a government or standards body is not relevant to the descriptions in this chapter, although it does improve the feasibility for institutional users to buy or commission such components.

Author information

Corresponding author

Correspondence to Yannick Versley.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Versley, Y., Björkelund, A. (2016). Off-the-Shelf Tools. In: Poesio, M., Stuckardt, R., Versley, Y. (eds) Anaphora Resolution. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47909-4_8

  • DOI: https://doi.org/10.1007/978-3-662-47909-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-47908-7

  • Online ISBN: 978-3-662-47909-4

  • eBook Packages: Computer Science (R0)
