Abstract
Lawmakers in the ASEAN countries need to examine the statutes of neighboring countries in order to draft consistent, uniform, and reasonable statutes. Moreover, non-lawyers who would like to invest or work overseas should understand the statutes of the countries under consideration and compare their regulatory requirements before deciding which country is suitable for investment or employment. This work proposes a platform for collecting and comparing laws. It consists of three modules: the first is a Web crawler for gathering statutes from the law archives of ASEAN countries; the second is a document preprocessing module that extracts the regulations from each statute of each country and aligns them across texts; and the third is a service with a tool for highlighting the relevant parts of the text. This paper proposes to use existing text processing tools, such as word/word-group segmentation and document section parsing, to extract entities, to annotate those entities with Wikidata’s ontological concepts, and then to align them across texts. However, concept selection raises two problems: concept ambiguity and concept granularity. To select a proper concept for entity alignment, a threshold on the maximum distance to the least common ancestor is computed. An experiment was conducted on the labor laws of Malaysia and Thailand to compare their minimum wages. Of the several thresholds tested, a threshold value of two gives the most appropriate concepts, with a precision of 48% and a recall of 67% for the alignment of related entities.
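The concept selection step can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the code below assumes one plausible reading: two entities are treated as alignable when the larger of their distances to their least common ancestor does not exceed the threshold. The hierarchy, its labels, and the function names `ancestor_distances`, `lca_distance`, and `can_align` are hypothetical and are not taken from Wikidata or from the paper; the hierarchy is also simplified to a tree in which each concept has one parent.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy hierarchy (child -> parent). These labels are
# illustrative only and are not real Wikidata items.
PARENT: Dict[str, str] = {
    "minimum wage": "wage",
    "daily wage": "wage",
    "wage": "remuneration",
    "overtime pay": "remuneration",
    "remuneration": "economic concept",
    "economic concept": "abstract entity",
    "working hours": "time interval",
    "time interval": "abstract entity",
}


def ancestor_distances(concept: str) -> Dict[str, int]:
    """Map each ancestor of `concept` (itself included) to its distance in edges."""
    distances: Dict[str, int] = {}
    node: Optional[str] = concept
    steps = 0
    # Walk up the parent links; the membership check guards against cycles.
    while node is not None and node not in distances:
        distances[node] = steps
        node = PARENT.get(node)
        steps += 1
    return distances


def lca_distance(a: str, b: str) -> Optional[Tuple[str, int]]:
    """Return the least common ancestor of a and b and the larger of the
    two distances to it, or None if they share no ancestor."""
    dist_a = ancestor_distances(a)
    dist_b = ancestor_distances(b)
    common = set(dist_a) & set(dist_b)
    if not common:
        return None
    # The least common ancestor is the shared ancestor closest to both concepts.
    lca = min(common, key=lambda c: max(dist_a[c], dist_b[c]))
    return lca, max(dist_a[lca], dist_b[lca])


def can_align(a: str, b: str, threshold: int = 2) -> bool:
    """Assumed rule: align two concepts when the maximum distance to their
    least common ancestor is at most `threshold`."""
    result = lca_distance(a, b)
    return result is not None and result[1] <= threshold


if __name__ == "__main__":
    for pair in [("minimum wage", "daily wage"),
                 ("minimum wage", "overtime pay"),
                 ("minimum wage", "working hours")]:
        print(pair, lca_distance(*pair), can_align(*pair, threshold=2))
```

In this toy hierarchy, a small threshold keeps alignment restricted to closely related concepts, while a larger one also admits pairs that meet only at a very general ancestor. This is the granularity trade-off the abstract refers to.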
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Satayamas, V., Kawtrakul, A., Yamakoshi, T. (2020). A Platform Development for Multilingual Law Collection and Comparative-Law Support Services: ASEAN Laws as a Case Study. In: Flouris, G., Laurent, D., Plexousakis, D., Spyratos, N., Tanaka, Y. (eds) Information Search, Integration, and Personalization. ISIP 2019. Communications in Computer and Information Science, vol 1197. Springer, Cham. https://doi.org/10.1007/978-3-030-44900-1_11
DOI: https://doi.org/10.1007/978-3-030-44900-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44899-8
Online ISBN: 978-3-030-44900-1
eBook Packages: Computer Science, Computer Science (R0)