Information extraction framework to build legislation network

Sakhaee, Neda; Wilson, Mark C.

doi:10.1007/s10506-020-09263-3

Original Research
Published: 28 January 2020

Volume 29, pages 35–58, (2021)

Artificial Intelligence and Law

Abstract

This paper concerns an information extraction process for building a dynamic legislation network from legal documents. Unlike supervised learning approaches, which require additional calculations, the idea here is to apply information extraction methodologies, identifying distinct expressions in legal text in order to extract network information. The study highlights the importance of data accuracy in network analysis and improves approximate string matching techniques to produce reliable network data sets with more than 98% precision and recall. The applications and the complexity of the resulting dynamic legislation network are also discussed and challenged.
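
As a rough illustration of the two ideas the abstract names, approximate string matching and precision/recall, the following Python sketch matches a noisy title mention against a list of known titles using Levenshtein edit distance, and scores an extracted edge set against a hand-checked one. It is not the authors' method: the abstract does not say which matching technique was improved, and the function names, the 0.2 distance threshold, and the sample titles are all assumptions made for this example.

```python
from typing import Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]  # (citing document, cited document); hypothetical edge format


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def best_match(mention: str, titles: Iterable[str],
               max_ratio: float = 0.2) -> Optional[str]:
    """Return the known title closest to `mention`, or None if none is close.

    `max_ratio` (assumed value) caps edit distance divided by the longer length.
    """
    best, best_ratio = None, max_ratio
    needle = mention.lower().strip()
    for title in titles:
        candidate = title.lower().strip()
        longest = max(len(needle), len(candidate), 1)
        ratio = levenshtein(needle, candidate) / longest
        if ratio <= best_ratio:
            best, best_ratio = title, ratio
    return best


def precision_recall(extracted: Set[Edge], gold: Set[Edge]) -> Tuple[float, float]:
    """Precision and recall of extracted edges against a hand-checked set."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    known = ["Companies Act 1993", "Crimes Act 1961"]
    # An OCR-style mention with a dropped letter and "l" in place of "1".
    print(best_match("Compnies Act l993", known))
    print(precision_recall({("A", "B"), ("A", "C")}, {("A", "B"), ("C", "D")}))
```

Normalising the distance by the longer string keeps the threshold comparable between short and long titles; a production matcher would likely also need to treat the year token separately, since a one-character change there points at a different Act.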

Notes

Shepard’s Citations include a judicial history of cases and statutes.
For more details about MetaLex please refer to Boer et al. (2010).
To estimate this error rate, a cluster sampling method is used to randomly choose ten sets of 30 entities. The samples are then checked manually to observe the rate of incorrectly matched entities.
Time periods: before 1800, 1800–1850, 1850–1900, 1900–1950, 1950–2000, 2000–2018.
To find the frequent words, the Textalyzer Python module is used. Frequent prepositions, conjunctions and articles are excluded from the analysis.
Based on their connectivity (total degree).
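
The sampling procedure in the third note (ten sets of 30 entities, checked by hand) can be sketched in Python as follows. This is only an illustration under stated assumptions: the note does not say how the clusters were formed, so consecutive blocks of the entity list are used here, and `is_wrong` is a hypothetical stand-in for the manual check.

```python
import random
from typing import Callable, List, Sequence


def sample_clusters(entities: Sequence[str], n_clusters: int = 10,
                    cluster_size: int = 30, seed: int = 0) -> List[List[str]]:
    """Split entities into consecutive full-size clusters and draw some at random.

    Forming clusters from consecutive blocks is an assumption of this sketch.
    """
    clusters = [list(entities[i:i + cluster_size])
                for i in range(0, len(entities) - cluster_size + 1, cluster_size)]
    if len(clusters) < n_clusters:
        raise ValueError("not enough entities for the requested sample")
    return random.Random(seed).sample(clusters, n_clusters)


def estimated_error_rate(clusters: List[List[str]],
                         is_wrong: Callable[[str], bool]) -> float:
    """Share of sampled entities that the (manual) check flags as mismatched."""
    checked = [entity for cluster in clusters for entity in cluster]
    return sum(1 for entity in checked if is_wrong(entity)) / len(checked)


if __name__ == "__main__":
    # Made-up entity identifiers; in practice these would be matched titles.
    entities = [f"entity-{i}" for i in range(600)]
    sample = sample_clusters(entities)
    # Stand-in for a human verdict: flag every 30th entity as a mismatch.
    rate = estimated_error_rate(sample, lambda e: int(e.split("-")[1]) % 30 == 0)
    print(f"estimated error rate: {rate:.3f}")
```

Fixing the seed makes the drawn sample reproducible, which helps when the manual check is spread over several sessions.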

References

Albert R, Jeong H, Barabási A-L (2000) Error and attack tolerance of complex networks. Nature 406(6794):378
Andersen PM, Hayes PJ, Huettner AK, Schmandt LM, Nirenburg IB, Weinstein SP (1992) Automatic extraction of facts from press releases to generate news stories. In: Proceedings of the third conference on applied natural language processing. Association for Computational Linguistics, pp 170–177
Boer A, Hoekstra R, De Maat E, Vitali F, Palmirani M, Ratai B (2010) Metalex (open xml interchange format for legal and legislative resources). Management Center, Akon
Borgatti SP, Carley KM, Krackhardt D (2006) On the robustness of centrality measures under conditions of imperfect data. Soc Netw 28(2):124–136
Butts CT (2003) Network inference, error, and informant (in) accuracy: a Bayesian approach. Soc Netw 25(2):103–140
Canisius S, Sporleder C (2007) Bootstrapping information extraction from field books. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL)
Carlson A, Schafer C (2008) Bootstrapping information extraction from semi-structured web pages. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, Berlin, pp 195–210
Casteigts A, Flocchini P, Quattrociocchi W, Santoro N (2012) Time-varying graphs and dynamic networks. Int J Parallel Emergent Distrib Syst 27(5):387–408
Chiticariu L, Li Y, Reiss FR (2013) Rule-based information extraction is dead! long live rule-based information extraction systems! In: Proceedings of the 2013 conference on empirical methods in natural language processing, pp 827–832
Cohen KB, Demner-Fushman D (2014) Biomedical natural language processing, vol 11. John Benjamins Publishing Company, Amsterdam
Cohen W, Ravikumar P, Fienberg S (2003) A comparison of string metrics for matching names and records. In: KDD workshop on data cleaning and object consolidation, vol 3, pp 73–78
Damerau FJ (1964) A technique for computer detection and correction of spelling errors. Commun ACM 7(3):171–176
De Maat E, Winkels R, van Engers T (2006) Automated detection of reference structures in law. Frontiers in artificial intelligence and applications. IOS Press, Amsterdam, p 41
EUR-Lex (2020) Access to European Union law. https://eur-lex.europa.eu/homepage.html. Accessed 10 Sept 2017
Fowler JH, Johnson TR, Spriggs JF, Jeon S, Wahlbeck PJ (2007) Network analysis and the law: measuring the legal importance of precedents at the US supreme court. Polit Anal 15(3):324–346
Freitag D (2000) Machine learning for information extraction in informal domains. Mach Learn 39(2–3):169–202
Gultemen D, van Engers T (2013) Graph-based linking and visualization for legislation documents (glvd). In: Network analysis in law workshop, at ICAIL 2013: XIV international conference on AI and law, NAiL2013 ICAIL, Rome, Italy, 14 June
Hafner CD (1978) An information retrieval system based on a computer model of legal knowledge. UMI Research Press, Ann Arbor, MI
Hall PAV, Dowling GR (1980) Approximate string matching. ACM Comput Surv (CSUR) 12(4):381–402
Hearst MA (1992) Automatic acquisition of hyponyms from large text corpora. In: Proceedings of the 14th conference on computational linguistics, vol 2. Association for Computational Linguistics, pp 539–545
Humphries MD, Gurney K (2008) Network small-world-ness: a quantitative method for determining canonical network equivalence. PLoS ONE 3(4):e0002051
Jurafsky D, Martin JH (2014) Speech and language processing, vol 3. Pearson, London
Kartoun U (2017) Text nailing: an efficient human-in-the-loop text-processing method. Interactions 24(6):44–49
Koniaris M, Anagnostopoulos I, Vassiliou Y (2017) Network analysis in the legal domain: a complex model for European Union legal sources. J Complex Netw 6(2):243–268
Krallinger M, Leitner F, Rabal O, Vazquez M, Oyarzabal J, Valencia A (2013) Overview of the chemical compound and drug name recognition (chemdner) task. In: BioCreative challenge evaluation workshop, vol 2, p 2
Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions, and reversals. Sov Phys Dokl 10:707–710
McCallum A (2005) Information extraction: distilling structured data from unstructured text. Queue 3(9):4
Mendelson E (2008) Abbyy finereader professional 9.0. PC Magazine
Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv (CSUR) 33(1):31–88
New Zealand Legal Information Institute (2020) Free access to legal information in New Zealand. http://www.nzlii.org. Accessed 31 Oct 2018
New Zealand Parliamentary Counsel Office (2020) The authoritative source of New Zealand legislation. http://www.legislation.govt.nz. Accessed 31 Oct 2018
Niu Q, Zeng A, Fan Y, Di Z (2015) Robustness of centrality measures against network manipulation. Physica A 438:124–131
Pasula H, Marthi B, Milch B, Russell SJ, Shpitser I (2003) Identity uncertainty and citation matching. In: Advances in neural information processing systems, pp 1425–1432
Philips L (1990) Hanging on the metaphone. Comput Lang 7(12):39–43
Sakhaee N (2018) Leginet New Zealand, first outcome of the new information extraction framework proposed to build legislation network. https://doi.org/10.7910/dvn/ib3qsf. Published 21 Sept 2018
Sakhaee N, Wilson M, Hendy S, Zakeri G (2017) Network analysis of New Zealand legislation. NZ Law J 10:332–337
Sakhaee N, Wilson MC, Zakeri G (2016) New Zealand legislation network. In: Legal knowledge and information systems: JURIX 2016: the twenty-ninth annual conference, vol 294. IOS Press, p 199
Tabak BM, Takami M, Rocha JMC, Cajueiro DO, Souza SRS (2014) Directed clustering coefficient as a measure of systemic risk in complex banking networks. Phys A Stat Mech Appl 394:211–216
Tin CT, Jeffrey LC, Mark DT, Kenneth GY, Rachel E (2009) Information extraction from legal documents. In: 2009 eighth international symposium on natural language processing
Trier OD, Jain AK, Taxt T et al (1996) Feature extraction methods for character recognition-a survey. Pattern Recognit 29(4):641–662
Ukkonen E (1992) Approximate string-matching with q-grams and maximal matches. Theor Comput Sci 92(1):191–211
Watts DJ (2004) Small worlds: the dynamics of networks between order and randomness, vol 9. Princeton University Press, Princeton
Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393(6684):440
Winkler WE (1999) The state of record linkage and current research problems. Statistical Research Division, US Census Bureau, Suitland
Zhang P, Koppaka L (2007) Semantics-based legal citation network. In: Proceedings of the 11th international conference on artificial intelligence and law. ACM, pp 123–130
Zhang Y, Patrick J (2005) Paraphrase identification by text canonicalization. In: Proceedings of the Australasian language technology workshop, pp 160–166

Author information

Authors and Affiliations

University of Auckland, Auckland, New Zealand
Neda Sakhaee & Mark C. Wilson

Corresponding author

Correspondence to Neda Sakhaee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sakhaee, N., Wilson, M.C. Information extraction framework to build legislation network. Artif Intell Law 29, 35–58 (2021). https://doi.org/10.1007/s10506-020-09263-3

Published: 28 January 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s10506-020-09263-3
