Off-the-Shelf Tools

Versley, Yannick; Björkelund, Anders

doi:10.1007/978-3-662-47909-4_8

Chapter
First Online: 05 August 2016

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

Off-the-shelf coreference tools can be a useful ingredient for downstream applications in machine translation, information extraction, or sentiment recognition. In this chapter, we present the properties that matter most when integrating a coreference system into a larger context. We then describe the BART system and the dCoref system that is part of Stanford’s CoreNLP suite, as well as IMSCoref and HOTCoref as examples of state-of-the-art systems that are based purely on machine learning. We finish the chapter by outlining a checklist-based approach to choosing, integrating, and adapting a coreference system for a putative new application context.

Notes

1. In the context of this discussion, "standardized" means being described well enough that interoperable programs can be written that handle edge cases in the same way, and that stakeholders can be brought to agree on one particular interpretation of that description. Formal endorsement by a government or standards body is not relevant to the descriptions in this chapter, although it does make it more feasible for institutional users to buy or commission such components.

Author information

Authors and Affiliations

ICL Universität Heidelberg, Heidelberg, Germany
Yannick Versley
IMS Universität Stuttgart, Stuttgart, Germany
Anders Björkelund

Corresponding author

Correspondence to Yannick Versley.

Editor information

Editors and Affiliations

Trento, Italy
Massimo Poesio
Frankfurt am Main, Germany
Roland Stuckardt
Heidelberg, Germany
Yannick Versley

About this chapter

Cite this chapter

Versley, Y., Björkelund, A. (2016). Off-the-Shelf Tools. In: Poesio, M., Stuckardt, R., Versley, Y. (eds) Anaphora Resolution. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47909-4_8

DOI: https://doi.org/10.1007/978-3-662-47909-4_8
Published: 05 August 2016
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-47908-7
Online ISBN: 978-3-662-47909-4
eBook Packages: Computer Science, Computer Science (R0)
