Advertisement

Dimension Table Selection Strategies to Referential Partition a Fact Table of Relational Data Warehouses

  • Ladjel BellatrecheEmail author
Chapter

Abstract

Enterprise wide data warehouses are becoming increasingly adopted as the main source and underlying infrastructure for business intelligence (BI) solutions. Note that a data warehouse can be viewed as an integration system, where data sources are duplicated in the same repository. Data warehouses are designed to handle the queries required to discover trends and critical factors are called Online Analytical Processing (OLAP) systems. Examples of an OLAP query are: Amazon (www.amazon.com) company analyzes purchases by its customers to come up with an individual screen with products of likely interest to the customer. Analysts at Wal-Mart (www.walmart.com) look for items with increasing sales in some city. Star schemes or their variants are usually used to model warehouse applications. They are composed of thousand of dimension tables and multiple fact tables [15, 18]. Figure 2.1 shows an example of star schema of the widely-known data warehouse benchmark APB-1 release II [21]. Here, the fact table Sales is joint to the following four dimension tables: Product, Customer, Time, Channel. Star queries are typically executed against the warehouse. Queries running on such applications contain a large number of costly joins, selections and aggregations. They are called mega queries [24]. To optimize these queries, the use of advanced optimization techniques is necessary.

Keywords

Dimension Tables Fact Table Data Warehouse Referential Partitioning OLAP Queries 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. 1.
    Bellatreche, L., Boukhalfa, K., Abdalla, H.I.: SAGA: A Combination of Genetic and Simulated Annealing Algorithms for Physical Data Warehouse Design. In: Proceedings of BNCOD’06, pp. 212–219 (2006)Google Scholar
  2. 2.
    Bellatreche, L., Boukhalfa, K., Richard, P.: Data Partitioning in Data Warehouses: Hardness Study, Heuristics and ORACLE Validation. In: Proceedings of DaWaK’2008, pp. 87–96 (2008)Google Scholar
  3. 3.
    Bellatreche, L., Boukhalfa, K., Richard, P., Woameno, K.Y.: Referential Horizontal Partitioning Selection Problem in Data Warehouses: Hardness Study and Selection Algorithms. In IJDWM. 5(4), 1–23 (2009)Google Scholar
  4. 4.
    Bellatreche, L., Karlapalem, K., Simonet. A.: Algorithms and Support for Horizontal Class Partitioning in Object-Oriented Databases. In the Distributed and Parallel Databases Journal, 8(2), 155–179 (2000)Google Scholar
  5. 5.
    Bellatreche, L., Woameno, K.Y.: Dimension Table Driven Approach to Referential Partition Relational Data Warehouses. In: ACM 12th International Workshop on Data Warehousing and OLAP (DOLAP), pp. 9–16 (2009)Google Scholar
  6. 6.
    Boukhalfa, K.: De la Conception Physique aux Outils d’Administration et de Tuning des Entrepts de Donnes. Poitiers University, France, PhD. Thesis. (2009)Google Scholar
  7. 7.
    Ceri, S., Negri, M., Pelagatti, G.: Horizontal Data Partitioning in Database Design. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. SIGPLAN Notices, pp. 128–136 (1982)Google Scholar
  8. 8.
    Cho, W.S., Park, C.M., Whang, K.Y., So, S.H.: A New Method for estimating the number of objects satisfying an object-oriented query involving partial participation of classes. Inf. Syst. 21(3), 253–267 (1996)CrossRefGoogle Scholar
  9. 9.
    Eadon, G., Chong, E.I., Shankar, S., Raghavan, A., Srinivasan, J., Das, S.: Supporting Table Partitioning By Reference in Oracle. In: Proceedings of SIGMOD’08, pp. 1111–1122 (2008)Google Scholar
  10. 10.
    Furtado, P.: Experimental evidence on partitioning in parallel data warehouses. In: Proceedings Of DOLAP, pp. 23–30 (2004)Google Scholar
  11. 11.
    Gibbons, P.B., Matias, Y., Poosala, V.: Fast incremental maintenance of approximate histograms. ACM Trans. Database Syst. 27(3), 261–298 (2002)CrossRefGoogle Scholar
  12. 12.
    Gray, J., Slutz, D.: Data Mining the SDSS SkyServer Database. Microsoft Research, Technical Report MSR-TR-2002-01 (2002)Google Scholar
  13. 13.
    Karlapalem, K., Navathe, S.B., Ammar, M.: Optimal Redesign Policies to Support Dynamic Processing of Applications on a Distributed Database System. Information Systems, 21(4), 353–367 (1996)CrossRefGoogle Scholar
  14. 14.
    Lei, H., Ross, K.A.: Faster Joins, Self-Joins and Multi-Way Joins Using Join Indices. In Data and Knowledge Engineering, 28(3), 277–298 (1998)zbMATHGoogle Scholar
  15. 15.
    Legler, T., Lehner, W., Ross, A.: Query Optimization For Data Warehouse System With Different Data Distribution Strategies, In BTW, pp. 502–513 (2007)Google Scholar
  16. 16.
    Mahboubi, H., Darmont, J.: Data mining-based fragmentation of XML data warehouses. In: Proceedings DOLAP’08, pp. 9–16 (2008)Google Scholar
  17. 17.
    Munneke, D., Wahlstrom, K., Mohania, M.K.: Fragmentation of Multidimensional Databases. In: Proceedings of ADC’99 pp. 153–164 (1999)Google Scholar
  18. 18.
    Neumann, T.: Query simplification: graceful degradation for join-order optimization. In: Proceedings of SIGMOD’09, pp. 403–414 (2009)Google Scholar
  19. 19.
    Noaman, A.Y., Barker, K.: A Horizontal Fragmentation Algorithm for the Fact Relation in a Distributed Data Warehouse. In: Proceedings of CIKM’99, pp. 154–161 (1999)Google Scholar
  20. 20.
    Oracle Data Sheet: Oracle Partitioning (2007) White Paper: http://www.oracle.com/technology/products/bi/db/11g
  21. 21.
    OLAP Council: APB-1 OLAP Benchmark, Release II. http://www.olapcouncil.org/research/bmarkly.htm (1998)
  22. 22.
    Özsu, M.T., Valduriez, P.: Principles of Distributed Database Systems, Second Ed. Prentice Hall (1999)Google Scholar
  23. 23.
    Sanjay, A., Narasayya, V.R., Yang, B.: Integrating Vertical and Horizontal Partitioning Into Automated Physical Database Design. In: Proceedings of SIGMOD’04, pp. 359–370 (2004)Google Scholar
  24. 24.
    Simon, E.: Reality check: a case study of an EII research prototype encountering customer needs. In Proceedings of EDBT’08, pp. 1 (2008)Google Scholar
  25. 25.
    Stöhr, T., Märtens, H., Rahm, E.: Multi-Dimensional Database Allocation for Parallel Data Warehouses. In: Proceedings of VLDB2000, pp. 273–284 (2000)Google Scholar
  26. 26.
    Steinbrunn, M., Moerkotte, G., Kemper, A.: Heuristic and Randomized Optimization for the Join Ordering Problem. In VLDB Journal. 6(3), 191–208 (1997)CrossRefGoogle Scholar
  27. 27.
    Swami, A.N., Schiefer, K.B.: On the Estimation of Join Result Sizes. In: Proceedings of EDBT’04, pp. 287–300 (1994)Google Scholar
  28. 28.
    Sybase: Sybase Adaptive Server Enterprise 15 Data Partitioning. White paper (2005)Google Scholar

Copyright information

© Springer Vienna 2012

Authors and Affiliations

  1. 1.LISI/ENSMA – Poitiers UniversityFuturoscopeFrance

Personalised recommendations