Automatic semantic edge labeling over legal citation graphs

Artificial Intelligence and Law

Abstract

Legal texts contain a large number of cross-references to various bodies of text, each serving a different purpose. Authorities and companies often need to examine particular types of these citations, yet automatic tools to aid in this process are lacking. Recently, citation graphs have been used to improve the intelligibility of complex rule frameworks. We propose an algorithm that builds the citation graph from a document and automatically labels each edge according to its purpose. Our method uses the citing text only and thus works only on citations whose purpose can be uniquely identified from their surrounding text. We apply this framework to the US Code. The paper defines and evaluates a standard gold set of labels that covers the vast majority of citation types appearing in the US Code yet is still small enough for practical use. We also propose a novel linear-chain conditional random field model that extracts the features required for labeling the citations from the surrounding text. We then analyze the effectiveness of different learning methods, such as K-means clustering and support vector machines, for automatically assigning each citation its corresponding label. Finally, we discuss the practical difficulties of this task and compare human accuracy with that of our end-to-end algorithm.
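To make the idea of a labeled citation graph concrete, the following is a minimal sketch, not the method of the paper. It assumes a simplified citation pattern ("section N of title M") and replaces the conditional random field and the learned classifier with a hypothetical cue-phrase lookup. The label names, the cue phrases, the context window and the function names are all illustrative assumptions, not the gold label set defined in the article.

```python
import re
from collections import defaultdict

# Assumed, simplified pattern for in-text references such as
# "section 552 of title 5"; real US Code citations are far more varied.
CITATION_RE = re.compile(r"section\s+(\d+[a-z]?)\s+of\s+title\s+(\d+)", re.IGNORECASE)

# Hypothetical cue phrases and labels, for illustration only.
CUE_LABELS = [
    ("as defined in", "definition"),
    ("except as provided in", "exception"),
    ("subject to", "constraint"),
]


def label_from_context(context):
    """Pick a label using only the text surrounding a citation."""
    lowered = context.lower()
    for cue, label in CUE_LABELS:
        if cue in lowered:
            return label
    return "unlabeled"


def build_labeled_graph(sections, window=40):
    """Return {source: [(target, label), ...]} for every citation found.

    `sections` maps a section identifier to its text. Each edge is labeled
    from the `window` characters preceding the citation plus the citation.
    """
    graph = defaultdict(list)
    for source, text in sections.items():
        for match in CITATION_RE.finditer(text):
            target = f"{match.group(2)} USC {match.group(1)}"
            start = max(0, match.start() - window)
            context = text[start:match.end()]
            graph[source].append((target, label_from_context(context)))
    return dict(graph)


# Invented example sentences, not quotations from the US Code.
sections = {
    "5 USC 551": "The term 'agency' has the meaning as defined in section 552 of title 5.",
    "5 USC 553": "Except as provided in section 554 of title 5, notice shall be published.",
}
graph = build_labeled_graph(sections)
for source, edges in graph.items():
    print(source, "->", edges)
```

In this sketch the labeling step sees nothing but the citing text, which mirrors the restriction stated in the abstract: a citation whose purpose cannot be read off its surrounding words stays "unlabeled".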

Notes

  1. https://www.law.cornell.edu/uscode/text.

  2. This dataset was also obtained during the annotation process but lacked semantic labels for the citations. Using it reduces the chance of overfitting, because predicate extraction is learned on a different dataset from the one used to train the label classifier.

Acknowledgements

We thank two anonymous reviewers for their insightful feedback, which helped us improve this manuscript. In addition, we would like to thank Vironica I Brown, Roman Diveev, Max Goldstein, Eva L Lauer, Nicholas W Long, Paul J Punzone and Joseph M Ragukonis for their contributions to the annotation process. We would also like to thank Benjamin Grider for his help in designing the graphical user interface for our system.

Author information

Corresponding author

Correspondence to Ali Sadeghian.

Additional information

This work is partially supported by the UF CISE Data Science Research Lab, the UF Law School and the ICAIR Program.

About this article

Cite this article

Sadeghian, A., Sundaram, L., Wang, D.Z. et al. Automatic semantic edge labeling over legal citation graphs. Artif Intell Law 26, 127–144 (2018). https://doi.org/10.1007/s10506-018-9217-1

  • DOI: https://doi.org/10.1007/s10506-018-9217-1
