Abstract
Automatic reasoning about textual information is a challenging task in modern Natural Language Processing (NLP) systems. In this work, we describe our proposal for representing and reasoning about Portuguese documents by means of Linked Data resources such as ontologies and thesauri. Our approach relies on a specialized NLP pipeline (part-of-speech tagging, named entity recognition, and semantic role labeling) to populate an ontology for the domain of criminal investigations. The proposed architecture and ontology are language independent. Although some of the NLP modules are language dependent, they can be built using appropriate AI methodologies.
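As a rough illustration of the ontology-population step, the sketch below turns one sentence, already annotated by the upstream NLP modules, into subject-predicate-object triples. It is a minimal sketch under stated assumptions and not the chapter's implementation: the namespace `http://example.org/crime#`, the property names (`hasActor`, `hasObject`, `hasPlace`, `hasTime`, `hasPredicate`), the PropBank-style role labels, and the example sentence are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical namespace; the chapter's actual ontology IRIs are not given here.
EX = "http://example.org/crime#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# Assumed mapping from semantic role labels to (hypothetical) ontology properties.
ROLE_TO_PROPERTY = {
    "A0": "hasActor",
    "A1": "hasObject",
    "AM-LOC": "hasPlace",
    "AM-TMP": "hasTime",
}


@dataclass
class AnnotatedSentence:
    """Output assumed from the tagging, NER, and SRL modules for one sentence."""
    predicate: str                                  # verb lemma from the SRL step
    roles: dict                                     # role label -> text span
    entities: dict = field(default_factory=dict)    # text span -> NER type


def iri(local_name):
    """Build an IRI in the hypothetical namespace from a label."""
    return "<" + EX + local_name.strip().replace(" ", "_") + ">"


def sentence_to_triples(sentence, event_id):
    """Create one event individual and link each mapped role filler to it."""
    event = iri(event_id)
    triples = [
        (event, RDF_TYPE, iri("Event")),
        (event, iri("hasPredicate"), '"' + sentence.predicate + '"'),
    ]
    for role, span in sentence.roles.items():
        prop = ROLE_TO_PROPERTY.get(role)
        if prop is None:
            continue  # roles without a mapping are skipped
        node = iri(span)
        triples.append((event, iri(prop), node))
        entity_type = sentence.entities.get(span)
        if entity_type is not None:
            # Type the filler when the NER step recognized it.
            triples.append((node, RDF_TYPE, iri(entity_type)))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples-style lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Invented example annotation for a Portuguese sentence.
example = AnnotatedSentence(
    predicate="roubar",
    roles={"A0": "João Silva", "A1": "carro", "AM-LOC": "Lisboa"},
    entities={"João Silva": "Person", "Lisboa": "Location"},
)
print(to_ntriples(sentence_to_triples(example, "event1")))
```

Keeping the role-to-property mapping in a single table reflects the language-independence claim: only the modules that produce the annotations depend on the language, while the mapping into the ontology stays the same.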
Acknowledgments
The authors would like to thank COMPETE 2020, PORTUGAL 2020 Program, the European Union, and ALENTEJO 2020 for supporting this research as part of Agatha Project SI & IDT number 18022 (Intelligent analysis system of open sources information for surveillance/crime control). The authors would also like to thank LISP - Laboratory of Informatics, Systems and Parallelism.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Quaresma, P., Beires Nogueira, V., Raiyani, K., Bayot, R., Gonçalves, T. (2020). From Textual Information Sources to Linked Data in the Agatha Project. In: Hofstedt, P., Abreu, S., John, U., Kuchen, H., Seipel, D. (eds) Declarative Programming and Knowledge Management. INAP 2019, WLP 2019, WFLP 2019. Lecture Notes in Computer Science, vol 12057. Springer, Cham. https://doi.org/10.1007/978-3-030-46714-2_5
DOI: https://doi.org/10.1007/978-3-030-46714-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46713-5
Online ISBN: 978-3-030-46714-2
eBook Packages: Computer Science, Computer Science (R0)