Data Warehouse Processing Scale-Up for Massive Concurrent Queries with SPIN

Costa, João Pedro; Furtado, Pedro

doi:10.1007/978-3-662-46335-2_1

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 8970)

Abstract

Data Warehouses (DW) store valuable information not only for strategic business decisions, but also for daily operational decisions. As a consequence, a large number of queries are submitted concurrently, stressing the database engine's ability to handle such workloads without severely degrading query response times. The query-at-a-time model of common database engines, where each query is executed independently and competes for the same resources, is inefficient for handling large DWs and does not provide the expected performance and scalability when processing large numbers of concurrent queries. Related work shows that there is a performance advantage in sharing data and processing, but the proposed solutions suffer from memory limitations, reduced scalability and unpredictable execution times when applied to large DWs, particularly those with large dimensions. SPIN proposes an approach to share computation and data among concurrent queries that delivers scale-up, even in the presence of massive query workloads. In this paper we describe the mechanisms used by SPIN to embed data and queries into a shared query processing pipeline tree and how SPIN dynamically reorganizes the processing tree. We also provide experimental validation of the approach.

Author information

Authors and Affiliations

DEIS, ISEC, Polytechnic Institute of Coimbra, Coimbra, Portugal
João Pedro Costa
University of Coimbra, Coimbra, Portugal
João Pedro Costa & Pedro Furtado

Corresponding author

Correspondence to João Pedro Costa.

Editor information

Editors and Affiliations

IRIT, Paul Sabatier University, Toulouse, France
Abdelkader Hameurlain
FAW, University of Linz, Linz, Austria
Josef Küng
FAW, University of Linz, Linz, Austria
Roland Wagner
LIAS/ISAE-ENSMA, Chasseneuil-du-Poitou, France
Ladjel Bellatreche
IBM India Research Lab, New Delhi, India
Mukesh Mohania

About this chapter

Cite this chapter

Costa, J.P., Furtado, P. (2015). Data Warehouse Processing Scale-Up for Massive Concurrent Queries with SPIN. In: Hameurlain, A., Küng, J., Wagner, R., Bellatreche, L., Mohania, M. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XVII. Lecture Notes in Computer Science, vol 8970. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46335-2_1

DOI: https://doi.org/10.1007/978-3-662-46335-2_1
Published: 30 January 2015
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46334-5
Online ISBN: 978-3-662-46335-2
eBook Packages: Computer Science, Computer Science (R0)
