Abstract
In this chapter we describe principles and architectures that support the development of NLP workflows and pipelines based on linked data technology. The benefit of building NLP workflows on linked data standards is that they rest on an open set of data models and Web technologies that can be implemented with standard functionality. They require no additional frameworks and so avoid the lock-in or dependence that comes with frameworks such as UIMA or GATE. We first describe how NLP workflows can be implemented using the Natural Language Processing Interchange Format (NIF), with examples of how a POS tagger and a dependency parser can be implemented as NIF-based web services. We then describe Teanga, a recent platform for NLP integration that uses Docker containers to implement NLP workflows. Finally, we describe LAPPS Grid, an open-source platform for NLP tools that builds on JSON-LD.
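As a rough illustration of the kind of output a NIF-based POS-tagging service might return, the following minimal sketch serializes token annotations as Turtle using the NIF core vocabulary and offset-based fragment URIs. It is not taken from the chapter: the toy lexicon, the function names (`toy_pos_tag`, `to_nif_turtle`) and the `http://example.org/doc1` base URI are hypothetical, and a real service would call an actual tagger and would typically be exposed over HTTP.

```python
import re

NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical toy lexicon standing in for a real POS tagger.
TOY_LEXICON = {"dogs": "NNS", "bark": "VBP"}


def toy_pos_tag(text):
    """Return (begin, end, tag) triples based on character offsets."""
    return [
        (m.start(), m.end(), TOY_LEXICON.get(m.group().lower(), "X"))
        for m in re.finditer(r"\w+", text)
    ]


def escape_literal(s):
    """Escape characters that are not allowed in a short Turtle string."""
    return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_nif_turtle(text, base_uri):
    """Serialize the text and its token annotations as NIF-style Turtle."""
    context = f"<{base_uri}#char=0,{len(text)}>"
    lines = [
        f"@prefix nif: <{NIF}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        # The whole document is a nif:Context carrying the original string.
        f"{context} a nif:Context ;",
        f'    nif:isString "{escape_literal(text)}" .',
    ]
    for begin, end, tag in toy_pos_tag(text):
        lines += [
            "",
            # Each token is a nif:Word addressed by its character offsets.
            f"<{base_uri}#char={begin},{end}> a nif:Word ;",
            f"    nif:referenceContext {context} ;",
            f'    nif:anchorOf "{escape_literal(text[begin:end])}" ;',
            f'    nif:beginIndex "{begin}"^^xsd:nonNegativeInteger ;',
            f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger ;',
            f'    nif:posTag "{tag}" .',
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_nif_turtle("Dogs bark", "http://example.org/doc1"))
```

Because every annotation is attached to a URI derived from the document and its character offsets, a downstream component such as a dependency parser could, under the same assumptions, add its own triples about the same token URIs without needing a shared in-memory framework.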
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Cimiano, P., Chiarcos, C., McCrae, J.P., Gracia, J. (2020). Linked Data-Based NLP Workflows. In: Linguistic Linked Data. Springer, Cham. https://doi.org/10.1007/978-3-030-30225-2_11
DOI: https://doi.org/10.1007/978-3-030-30225-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30224-5
Online ISBN: 978-3-030-30225-2
eBook Packages: Computer Science, Computer Science (R0)