Abstract
With the emergence of big data technologies, the problem of structure evolution in integrated heterogeneous data sources has become highly topical because of the dynamic and diverse nature of big data. To address this problem, we propose an architecture that makes it possible to store and process structured and unstructured data at different levels of detail, analyze them using OLAP capabilities, and semi-automatically manage changes in requirements and data expansion. In this paper, we concentrate on the metadata essential for the operation of the proposed architecture. We propose a metadata model that describes, in a flexible way, the schemata and supplementary properties of data sets that are extracted from sources and transformed to obtain integrated data for analysis. Furthermore, a unique feature of the proposed model is that it keeps track of all changes that occur in the system.
Acknowledgments
This work has been supported by the European Regional Development Fund (ERDF) project No. 1.1.1.2./VIAA/1/16/057.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Solodovnikova, D., Niedrite, L., Niedritis, A. (2019). On Metadata Support for Integrating Evolving Heterogeneous Data Sources. In: Welzer, T., et al. New Trends in Databases and Information Systems. ADBIS 2019. Communications in Computer and Information Science, vol 1064. Springer, Cham. https://doi.org/10.1007/978-3-030-30278-8_38
DOI: https://doi.org/10.1007/978-3-030-30278-8_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30277-1
Online ISBN: 978-3-030-30278-8
eBook Packages: Computer Science, Computer Science (R0)