Abstract
This paper contributes a reference architecture for a reusable infrastructure for running scientific experiments on data processing and data integration. The architecture is based on containerization and integrates with an external machine learning cloud service to build performance models.
Acknowledgement
The work of Robert Wrembel is partially supported by IBM Shared University Reward 2019.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bodziony, M., Wrembel, R. (2021). Reference Architecture for Running Large Scale Data Integration Experiments. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science, vol 12923. Springer, Cham. https://doi.org/10.1007/978-3-030-86472-9_1
DOI: https://doi.org/10.1007/978-3-030-86472-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86471-2
Online ISBN: 978-3-030-86472-9
eBook Packages: Computer Science, Computer Science (R0)