Automatic semantic edge labeling over legal citation graphs

Sadeghian, Ali; Sundaram, Laksshman; Wang, Daisy Zhe; Hamilton, William F.; Branting, Karl; Pfeifer, Craig

doi:10.1007/s10506-018-9217-1

Published: 01 March 2018

Volume 26, pages 127–144, (2018)

Artificial Intelligence and Law

Ali Sadeghian¹,
Laksshman Sundaram¹,²,
Daisy Zhe Wang¹,
William F. Hamilton¹,
Karl Branting³ &
…
Craig Pfeifer³

Abstract

Legal texts contain a large number of cross-references to various bodies of text, each serving a different purpose. Authorities and companies often need to examine particular types of these citations, yet there is a lack of automatic tools to aid in this process. Recently, citation graphs have been used to improve the intelligibility of complex rule frameworks. We propose an algorithm that builds the citation graph from a document and automatically labels each edge according to its purpose. Our method uses the citing text only and thus works only on citations whose purpose can be uniquely identified by their surrounding text. We apply this framework to the US Code. We define and evaluate a gold-standard set of labels that covers the vast majority of citation types appearing in the US Code yet remains short enough for practical use. We also propose a novel linear-chain conditional random field model that extracts, from the surrounding text, the features required for labeling the citations. We then analyze the effectiveness of different methods, such as K-means clustering and support vector machines, for automatically assigning each citation its label. Finally, we discuss the practical difficulties of this task and compare human accuracy with that of our end-to-end algorithm.
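The idea of a citation graph whose edges are labeled from the citing text alone can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the citation pattern, the cue phrases, the label names, and the function names `label_citation` and `build_labeled_graph` are hypothetical and are not the paper's gold label set or its method. In particular, a fixed cue-phrase lookup stands in for the learned conditional random field and the downstream labeling model described in the abstract.

```python
import re
from collections import defaultdict

# Hypothetical pattern for citations such as "section 552 of title 5";
# real US Code citations take many more forms.
CITATION = re.compile(r"section (\d+[a-z]?) of title (\d+)", re.IGNORECASE)

# Hypothetical cue phrases and labels, for illustration only. They are not
# the paper's label set, and the paper learns its features rather than
# looking them up in a fixed list.
CUES = [
    ("as defined in", "definition"),
    ("except as provided in", "exception"),
    ("subject to", "limitation"),
]


def label_citation(context):
    """Return a label based only on the text that precedes a citation."""
    lowered = context.lower()
    for cue, label in CUES:
        if cue in lowered:
            return label
    return "unlabeled"


def build_labeled_graph(sections, window=40):
    """Map each citing section to a list of (cited section, label) edges.

    `sections` maps a section identifier to its text. `window` is the number
    of characters before each citation that are used as its context.
    """
    graph = defaultdict(list)
    for source, text in sections.items():
        for match in CITATION.finditer(text):
            target = f"{match.group(2)} U.S.C. {match.group(1)}"
            context = text[max(0, match.start() - window):match.start()]
            graph[source].append((target, label_citation(context)))
    return dict(graph)


# Two made-up section texts, not quotations from the US Code.
sections = {
    "5 U.S.C. 100": "The term agency has the meaning as defined in section 551 of title 5.",
    "5 U.S.C. 200": "Except as provided in section 552 of title 5, records are public.",
}
graph = build_labeled_graph(sections)
for source, edges in graph.items():
    print(source, "->", edges)
```

A fixed character window is the simplest possible notion of "surrounding text"; when two citations sit close together, the window of the second can pick up the cue of the first, which is one reason a sequence model over the citing sentence is a more plausible choice than this lookup.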

Notes

https://www.law.cornell.edu/uscode/text.
This dataset was also obtained during the annotation process but lacked semantic labels for the citations. Because predicate extraction is learned on a different dataset from the one used to train the label classifier, the chance of overfitting is reduced.

Acknowledgements

We thank two anonymous reviewers for their insightful feedback, which helped us improve this manuscript. In addition, the authors would like to thank Vironica I Brown, Roman Diveev, Max Goldstein, Eva L Lauer, Nicholas W Long, Paul J Punzone and Joseph M Ragukonis for their contributions to the annotation process. We would also like to thank Benjamin Grider for his help in designing the graphical user interface for our system.

Author information

Authors and Affiliations

University of Florida, Gainesville, FL, USA
Ali Sadeghian, Laksshman Sundaram, Daisy Zhe Wang & William F. Hamilton
Stanford University, Stanford, CA, USA
Laksshman Sundaram
MITRE Corp., McLean, VA, USA
Karl Branting & Craig Pfeifer

Corresponding author

Correspondence to Ali Sadeghian.

Additional information

This work is partially supported by UF CISE Data Science Research Lab, UF Law School and ICAIR Program.

About this article

Cite this article

Sadeghian, A., Sundaram, L., Wang, D.Z. et al. Automatic semantic edge labeling over legal citation graphs. Artif Intell Law 26, 127–144 (2018). https://doi.org/10.1007/s10506-018-9217-1

Published: 01 March 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s10506-018-9217-1
