Designing Parallel Relational Data Warehouses: A Global, Comprehensive Approach

  • Soumia Benkrid
  • Ladjel Bellatreche
  • Alfredo Cuzzocrea
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 241)


Designing a parallel data warehouse involves two main steps: (1) fragmentation and (2) allocation of the resulting fragments to the nodes. Usually, the data warehouse is split horizontally, the fragments are allocated over the nodes, and finally the load is balanced across the nodes of the parallel machine. The main drawback of this design approach is its high communication cost. Data Replication (DR) has therefore become a requirement, both for availability and for minimizing communication cost. In this paper, we present a redundant allocation algorithm for designing shared-nothing parallel relational data warehouses, based on the well-known fuzzy k-means clustering algorithm.
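The abstract does not spell out the algorithm, so the following is only a minimal sketch of how fuzzy clustering could drive a redundant allocation. It uses the standard fuzzy c-means update rules (the algorithm also known as fuzzy k-means). Everything else is an assumption made for illustration: the two-dimensional fragment feature vectors, the one-cluster-per-node mapping, the `threshold` replication rule, and all function names are hypothetical and not taken from the paper.

```python
import math

def memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership degrees u[i][j] of point i in cluster j."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if any(x == 0.0 for x in d):
            # The point sits exactly on a center: share full membership
            # among the coincident centers and give 0 to the others.
            hits = [1.0 if x == 0.0 else 0.0 for x in d]
            u.append([h / sum(hits) for h in hits])
            continue
        u.append([1.0 / sum((d[j] / d[k]) ** power for k in range(len(d)))
                  for j in range(len(d))])
    return u

def update_centers(points, u, old_centers, m=2.0):
    """Recompute each center as the mean of the points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for j, old in enumerate(old_centers):
        w = [row[j] ** m for row in u]
        total = sum(w)
        if total == 0.0:
            centers.append(list(old))  # no point supports this cluster; keep it
            continue
        centers.append([sum(w[i] * points[i][k] for i in range(len(points))) / total
                        for k in range(dim)])
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Run a fixed number of iterations; return final memberships and centers."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, centers, m)
    return memberships(points, centers, m), centers

def redundant_allocation(u, threshold=0.3):
    """ASSUMED rule: put a fragment on every node (cluster) whose membership
    degree reaches the threshold, and always on its best-matching node."""
    allocation = []
    for row in u:
        best = max(range(len(row)), key=row.__getitem__)
        nodes = {j for j, degree in enumerate(row) if degree >= threshold}
        nodes.add(best)
        allocation.append(sorted(nodes))
    return allocation

# Hypothetical fragments, each described by a made-up 2-D usage vector.
fragments = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (0.5, 0.5)]
initial_centers = [(1.0, 0.0), (0.0, 1.0)]  # one cluster per node (assumption)

u, centers = fuzzy_c_means(fragments, initial_centers)
allocation = redundant_allocation(u, threshold=0.3)
for fragment, nodes in zip(fragments, allocation):
    print(fragment, "-> nodes", nodes)
```

In this toy input the last fragment lies midway between the two clusters, so its membership degree is about 0.5 for each and the sketch replicates it on both nodes, while the other fragments are stored once. A hard clustering would have to pick a single node for it, which is the intuition behind using membership degrees to decide replication.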


Keywords: Fuzzy Cluster, Data Warehouse, Membership Degree, Data Replication, Data Allocation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Soumia Benkrid (1, 2; corresponding author)
  • Ladjel Bellatreche (1)
  • Alfredo Cuzzocrea (3)

  1. LIAS/ISAE-ENSMA, Poitiers, France
  2. National High School for Computer Science (ESI), Algiers, Algeria
  3. ICAR-CNR and University of Calabria, Rende, Italy
