Parallel and Distributed Data Warehouses

  • Reference work entry
Encyclopedia of Database Systems

Synonyms

High performance data warehousing; Scalable decision support systems

Definition

With the era of Big Data, we are facing a data deluge (http://www.economist.com/node/15579717). Multiple data providers contribute to this deluge. Three main examples can be cited: (i) the massive use of sensors (e.g., planes generate 10 terabytes of data every 30 min), (ii) the massive use of social networks (e.g., 340 million tweets per day), and (iii) transactions (e.g., Walmart handles more than one million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data). Decision makers need fast response times to their requests so that, by analyzing large volumes of data, they can predict user behavior in real time and offer users suitable services. Data warehouse (\(\mathcal {DW}\)) technology deployed on conventional (e.g., centralized) platforms has become obsolete, even with the spectacular progress in terms of advanced...




Correspondence to Ladjel Bellatreche.


Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature


Cite this entry

Bellatreche, L., Davis, T., Djahida, B. (2018). Parallel and Distributed Data Warehouses. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_261
