Named Entity Recognition and Resolution in Legal Text

Dozier, Christopher; Kondadadi, Ravikumar; Light, Marc; Vachher, Arun; Veeramachaneni, Sriharsha; Wudali, Ramdev

doi:10.1007/978-3-642-12837-0_2

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6036)

Abstract

Named entities are the persons, places, companies, and similar items that a text mentions explicitly using proper nouns. Named entity recognition is the process of finding named entities in a text and assigning each one a semantic type. Named entity resolution is the process of linking a mention of a name in text to a pre-existing database entry, which grounds the mention in something analogous to a real-world entity. For example, a mention of a judge named Mary Smith might be resolved to a database entry for a specific judge of a specific district of a specific state. Recognition and resolution can be leveraged in a number of ways, including providing hypertext links to information stored about a particular judge: their education, who appointed them, their other case opinions, etc.
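The Mary Smith example can be made concrete with a small sketch. Everything in it is an assumption made for illustration rather than the chapter's actual system: the `JudgeRecord` fields, the three records, the `resolve_judge` helper, and the hand-set scores. The sketch narrows the candidates to records sharing the mention's surname, scores each candidate on first-name and state agreement, and declines to link when the top two candidates tie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgeRecord:
    """One entry in a hypothetical judge database."""
    record_id: str
    first: str
    last: str
    court: str
    state: str


# Hypothetical records, invented for this example.
JUDGES = [
    JudgeRecord("J1", "Mary", "Smith", "District Court", "Minnesota"),
    JudgeRecord("J2", "Mary", "Smith", "Court of Appeals", "Ohio"),
    JudgeRecord("J3", "Martin", "Smythe", "District Court", "Minnesota"),
]


def resolve_judge(first, last, doc_state, records=JUDGES):
    """Return the record_id of the best match, or None if absent or ambiguous."""
    # Candidate selection: only records with the same surname are compared.
    candidates = [r for r in records if r.last.lower() == last.lower()]

    scored = []
    for r in candidates:
        score = 0
        if r.first.lower() == first.lower():
            score += 2  # full first-name agreement
        elif r.first[:1].lower() == first[:1].lower():
            score += 1  # initial-only agreement
        if r.state.lower() == doc_state.lower():
            score += 2  # the document's state agrees with the record
        scored.append((score, r.record_id))

    scored.sort(reverse=True)
    if not scored:
        return None
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # tie between the top candidates: refuse to guess
    return scored[0][1]


print(resolve_judge("Mary", "Smith", "Ohio"))
print(resolve_judge("Mary", "Smith", "Texas"))
```

With these records, a mention of Mary Smith in an Ohio document links to `J2`, while the same name in a Texas document stays unlinked because both Mary Smith records score equally. The fixed scores only stand in for a matching function; a learned matcher could replace them without changing the overall shape.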

This paper discusses named entity recognition and resolution in legal documents such as US case law, depositions, pleadings, and other trial documents. The entity types include judges, attorneys, companies, jurisdictions, and courts.

We outline three methods for named entity recognition: lookup, context rules, and statistical models. We then describe an actual system for finding named entities in legal text and evaluate its accuracy. Similarly, for resolution, we discuss our blocking techniques, our resolution features, and the supervised and semi-supervised machine learning techniques we employ for the final matching.
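Two of the three recognition methods, lookup and context rules, can be illustrated with a minimal sketch. The gazetteer entries, the `JUDGE_RULE` pattern, and the `find_entities` helper are hypothetical and are not taken from the system the chapter describes. The statistical approach is left out because it needs trained model parameters.

```python
import re

# Lookup: a hypothetical gazetteer mapping known surface strings to types.
COMPANY_LOOKUP = {
    "Thomson Reuters": "COMPANY",
    "Acme Corp.": "COMPANY",
}

# Context rule: one to three capitalized words after a judicial title
# are tagged as a judge name. The pattern is an assumption for this sketch.
JUDGE_RULE = re.compile(
    r"\b(?:Judge|Justice)\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+){0,2})"
)


def find_entities(text):
    """Return (start, end, surface, type) tuples sorted by start offset."""
    found = []

    # Lookup pass: exact string matches against the gazetteer.
    for name, etype in COMPANY_LOOKUP.items():
        for m in re.finditer(re.escape(name), text):
            found.append((m.start(), m.end(), m.group(0), etype))

    # Context-rule pass: the title triggers the rule, the name is captured.
    for m in JUDGE_RULE.finditer(text):
        found.append((m.start(1), m.end(1), m.group(1), "JUDGE"))

    return sorted(found)


sentence = "Judge Mary Smith ruled against Acme Corp. in Minnesota."
for start, end, surface, etype in find_entities(sentence):
    print(etype, surface)
```

Lookup can only find names that are already listed, while a context rule can tag a name it has never seen, provided the trigger word is present. That difference is one reason to combine the two.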

Author information

Authors and Affiliations

Thomson Reuters Research and Development, Eagan, MN 55123, USA
Christopher Dozier, Ravikumar Kondadadi, Marc Light, Arun Vachher, Sriharsha Veeramachaneni & Ramdev Wudali

Editor information

Editors and Affiliations

Institute of Legal Information, Theory and Techniques, ITTIG-CNR, Via dei Barucci 20, 50127, Florence, Italy
Enrico Francesconi & Daniela Tiscornia
Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC) - CNR, Area della Ricerca di Pisa, Via Moruzzi 1, 56124, Pisa, Italy
Simonetta Montemagni
Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP, Sheffield, UK
Wim Peters

About this chapter

Cite this chapter

Dozier, C., Kondadadi, R., Light, M., Vachher, A., Veeramachaneni, S., Wudali, R. (2010). Named Entity Recognition and Resolution in Legal Text. In: Francesconi, E., Montemagni, S., Peters, W., Tiscornia, D. (eds) Semantic Processing of Legal Texts. Lecture Notes in Computer Science, vol 6036. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12837-0_2

DOI: https://doi.org/10.1007/978-3-642-12837-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12836-3
Online ISBN: 978-3-642-12837-0
eBook Packages: Computer Science, Computer Science (R0)
