Efficient High-Level Semantic Enrichment of Undocumented Enterprise Data

Schröder, Markus

doi:10.1007/978-3-030-32327-1_41

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11762)

Included in the following conference series:

European Semantic Web Conference

Abstract

In the absence of a data management strategy, undocumented enterprise data piles up and becomes increasingly difficult for companies to use to its full potential. As a solution, we propose enriching such data with meaning, or more precisely, interlinking data content with high-level semantic concepts. In contrast to low-level data lifting and mid-level information extraction, we aim to reach a high level of knowledge conceptualization. Currently, this can only be achieved if human experts are integrated into the enrichment process. Since human expertise is costly and limited, our methodology is designed to be as efficient as possible. This includes quantifying enrichment levels as well as assessing the efficiency of gathering and exploiting user feedback. This paper proposes research on how semantic enrichment of undocumented enterprise data with humans in the loop can be conducted. We have already obtained promising preliminary results from several projects in which we enriched various enterprise data.

Acknowledgements

Parts of this work have been funded by the German Federal Ministry of Economic Affairs and Energy in the project PRO-OPT (01MD15004D) and by the German Federal Ministry of Food and Agriculture in the project SDSD (2815708615). I thank my doctoral supervisor Prof. Dr. Andreas Dengel and my colleagues Christian Jilek, Dr. Heiko Maus, Dr. Sven Schwarz, Dr. Jörn Hees and Dr. Ansgar Bernardi for their helpful discussions, comments and feedback.

Author information

Authors and Affiliations

Smart Data & Knowledge Services Department, DFKI GmbH, Kaiserslautern, Germany
Markus Schröder
Computer Science Department, TU Kaiserslautern, Kaiserslautern, Germany
Markus Schröder

Corresponding author

Correspondence to Markus Schröder.

Editor information

Editors and Affiliations

Kansas State University, Manhattan, KS, USA
Pascal Hitzler
Vienna University of Economics and Business, Vienna, Austria
Sabrina Kirrane
Linköping University, Linköping, Sweden
Olaf Hartig
Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Victor de Boer
Leibniz Information Centre for Science and Technology University Library (TIB), Hannover, Germany
Maria-Esther Vidal
University of Bonn, Bonn, Germany
Maria Maleshkova
Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Stefan Schlobach
Jönköping University, Jönköping, Sweden
Karl Hammar
F. Hoffmann-La Roche AG, Basel, Switzerland
Nelia Lasierra
Robert Bosch GmbH, Stuttgart, Germany
Steffen Stadtmüller
Aalborg University, Aalborg, Denmark
Katja Hose
IMEC, Ghent University, Ghent, Belgium
Ruben Verborgh

About this paper

Cite this paper

Schröder, M. (2019). Efficient High-Level Semantic Enrichment of Undocumented Enterprise Data. In: Hitzler, P., et al. The Semantic Web: ESWC 2019 Satellite Events. ESWC 2019. Lecture Notes in Computer Science, vol 11762. Springer, Cham. https://doi.org/10.1007/978-3-030-32327-1_41

DOI: https://doi.org/10.1007/978-3-030-32327-1_41
Published: 10 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32326-4
Online ISBN: 978-3-030-32327-1
eBook Packages: Computer Science, Computer Science (R0)