Representing Annotated Texts as RDF

Cimiano, Philipp; Chiarcos, Christian; McCrae, John P.; Gracia, Jorge

doi:10.1007/978-3-030-30225-2_5

Philipp Cimiano,
Christian Chiarcos,
John P. McCrae (ORCID: orcid.org/0000-0002-7227-1331) &
Jorge Gracia

Abstract

Text annotation consists of defining markables (the elements to be annotated), their features (the attributes and values of annotations), and relations between markables (e.g. syntactic dependencies or semantic links). In this chapter we describe the principles for annotating text data using RDF-compliant formalisms. These principles provide the basis for making annotated corpora and text collections accessible from the LLOD ecosystem.
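
The three ingredients named above (markables, features, relations) can be illustrated with a minimal sketch. It is not the chapter's own model: the `http://example.org/` namespace, the property names (`anchorOf`, `pos`, `nsubjOf`), the tag values and the offset-based URI pattern are all assumptions chosen for illustration. The sketch mints one URI per character span and prints statements in N-Triples syntax.

```python
# Illustrative sketch only: namespace, property names and tag values are hypothetical.
EX = "http://example.org/anno#"
DOC = "http://example.org/doc1.txt"

text = "Dogs bark"


def markable(begin, end):
    """Mint a URI for the span text[begin:end] (assumed offset-based scheme)."""
    return f"{DOC}#char={begin},{end}"


def triple(s, p, o, literal=False):
    """Serialise one statement as an N-Triples line.

    No string escaping is done, so literals must not contain quotes
    or backslashes in this sketch.
    """
    obj = f'"{o}"' if literal else f"<{o}>"
    return f"<{s}> <{p}> {obj} ."


# Markables: the two tokens of the sentence, addressed by character offsets.
dogs = markable(0, 4)
bark = markable(5, 9)

triples = [
    # Features: attribute-value pairs attached to a markable.
    triple(dogs, EX + "anchorOf", text[0:4], literal=True),
    triple(dogs, EX + "pos", "NOUN", literal=True),
    triple(bark, EX + "anchorOf", text[5:9], literal=True),
    triple(bark, EX + "pos", "VERB", literal=True),
    # Relation: a link between two markables (here, a syntactic dependency).
    triple(dogs, EX + "nsubjOf", bark),
]

for line in triples:
    print(line)
```

Because every markable is a URI, features and relations are ordinary RDF statements about it, which is what allows annotations to be linked to other resources. A real data set would use a published vocabulary instead of the made-up properties shown here.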

Author information

Authors and Affiliations

Semantic Computing Group, Bielefeld University, Bielefeld, Germany
Philipp Cimiano
Angewandte Computerlinguistik, Goethe-University, Frankfurt am Main, Germany
Christian Chiarcos
Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
John P. McCrae
Aragon Institute of Engineering Research (I3A), University of Zaragoza, Zaragoza, Spain
Jorge Gracia

About this chapter

Cite this chapter

Cimiano, P., Chiarcos, C., McCrae, J.P., Gracia, J. (2020). Representing Annotated Texts as RDF. In: Linguistic Linked Data. Springer, Cham. https://doi.org/10.1007/978-3-030-30225-2_5

DOI: https://doi.org/10.1007/978-3-030-30225-2_5
Published: 14 January 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30224-5
Online ISBN: 978-3-030-30225-2
eBook Packages: Computer Science, Computer Science (R0)