Reference Architecture for Running Large Scale Data Integration Experiments

Bodziony, Michał; Wrembel, Robert

doi:10.1007/978-3-030-86472-9_1

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12923)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

This paper contributes a reference architecture of a reusable infrastructure for scientific experiments on data processing and data integration. The architecture is based on containerization and is integrated with an external machine learning cloud service to build performance models.

Acknowledgement

The work of Robert Wrembel is partially supported by IBM Shared University Reward 2019.

Author information

Authors and Affiliations

IBM Poland, Software Lab Kraków, Kraków, Poland
Michał Bodziony
Poznan University of Technology, Poznań, Poland
Robert Wrembel

Corresponding author

Correspondence to Robert Wrembel.

Editor information

Editors and Affiliations

University of Vienna, Vienna, Austria
Christine Strauss
Johannes Kepler University of Linz, Linz, Oberösterreich, Austria
Gabriele Kotsis
Vienna University of Technology, Vienna, Austria
A Min Tjoa
Johannes Kepler University of Linz, Linz, Austria
Ismail Khalil

About this paper

Cite this paper

Bodziony, M., Wrembel, R. (2021). Reference Architecture for Running Large Scale Data Integration Experiments. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science, vol 12923. Springer, Cham. https://doi.org/10.1007/978-3-030-86472-9_1

DOI: https://doi.org/10.1007/978-3-030-86472-9_1
Published: 31 August 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86471-2
Online ISBN: 978-3-030-86472-9
eBook Packages: Computer Science, Computer Science (R0)
