Statistical Recognition of References in Czech Court Decisions

Kríž, Vincent; Hladká, Barbora; Dědek, Jan; Nečaský, Martin

doi:10.1007/978-3-319-13647-9_6

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8856)

Abstract

We address the detection and classification of references in Czech court decisions, focusing mainly on references to other court decisions and to acts. We are also interested in detecting the institutions that issued the documents under consideration. We treat these references as entities in a Named Entity Recognition task. We approach the task with machine learning methods, namely a hidden Markov model (HMM) and the Perceptron algorithm, and report an F-measure over 90% averaged over all entities. The results significantly outperform previously published systems.
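
To make the reported metric concrete, the following is a minimal sketch of an entity-level F-measure averaged over entity types. It is not the authors' evaluation code. The abstract does not say how matches are counted or how the average is taken, so exact span matching and an unweighted (macro) average are assumptions here, and the labels `court_decision`, `act`, and `institution` are hypothetical names chosen for illustration.

```python
def f_measure_by_type(gold, predicted):
    """Return the F1 score for each entity type.

    gold, predicted: sets of (start, end, entity_type) spans.
    Assumption: a predicted span counts as correct only if its
    boundaries and type match a gold span exactly.
    """
    types = {span[2] for span in gold | predicted}
    scores = {}
    for entity_type in types:
        gold_spans = {s for s in gold if s[2] == entity_type}
        pred_spans = {s for s in predicted if s[2] == entity_type}
        true_positives = len(gold_spans & pred_spans)
        precision = true_positives / len(pred_spans) if pred_spans else 0.0
        recall = true_positives / len(gold_spans) if gold_spans else 0.0
        if precision + recall == 0:
            scores[entity_type] = 0.0
        else:
            scores[entity_type] = 2 * precision * recall / (precision + recall)
    return scores


def macro_average(scores):
    """Unweighted mean of per-type scores (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical character-offset spans for one document.
gold = {(0, 3, "court_decision"), (10, 14, "act"), (20, 22, "institution")}
predicted = {(0, 3, "court_decision"), (10, 13, "act"), (20, 22, "institution")}

scores = f_measure_by_type(gold, predicted)
average = macro_average(scores)
```

In this toy input the predicted `act` span has the wrong right boundary, so under exact matching it contributes nothing for that type while the other two types are matched perfectly.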

Author information

Authors and Affiliations

Institute of Formal and Applied Linguistics, Charles University in Prague, Malostranské nám. 25, 118 00, Praha 1, Czech Republic
Vincent Kríž & Barbora Hladká
Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague, Malostranské nám. 25, 118 00, Praha 1, Czech Republic
Jan Dědek & Martin Nečaský

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan Dios Bátiz s/n, Col. Nueva Industrial Vallejo, 07738, Mexico City, Mexico
Alexander Gelbukh
Área Académica de Computación y Electrónica, Carretera Pachuca-Tulancingo, Universidad Autónoma del Estado de Hidalgo, Km. 4.5, Col. Carboneras, Mineral de la Reforma, 42180, Hidalgo, Mexico
Félix Castro Espinoza
Facultad de ciencias, Universidad Autónoma Nacional de México, Ciudad Universitaria, México DF, Mexico
Sofía N. Galicia-Haro

About this paper

Cite this paper

Kríž, V., Hladká, B., Dědek, J., Nečaský, M. (2014). Statistical Recognition of References in Czech Court Decisions. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_6

DOI: https://doi.org/10.1007/978-3-319-13647-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science, Computer Science (R0)
