Skip to main content

Dimension Table Selection Strategies to Referential Partition a Fact Table of Relational Data Warehouses

  • Chapter
  • First Online:
Recent Trends in Information Reuse and Integration

Abstract

Enterprise wide data warehouses are becoming increasingly adopted as the main source and underlying infrastructure for business intelligence (BI) solutions. Note that a data warehouse can be viewed as an integration system, where data sources are duplicated in the same repository. Data warehouses are designed to handle the queries required to discover trends and critical factors are called Online Analytical Processing (OLAP) systems. Examples of an OLAP query are: Amazon (www.amazon.com) company analyzes purchases by its customers to come up with an individual screen with products of likely interest to the customer. Analysts at Wal-Mart (www.walmart.com) look for items with increasing sales in some city. Star schemes or their variants are usually used to model warehouse applications. They are composed of thousand of dimension tables and multiple fact tables [15, 18]. Figure 2.1 shows an example of star schema of the widely-known data warehouse benchmark APB-1 release II [21]. Here, the fact table Sales is joint to the following four dimension tables: Product, Customer, Time, Channel. Star queries are typically executed against the warehouse. Queries running on such applications contain a large number of costly joins, selections and aggregations. They are called mega queries [24]. To optimize these queries, the use of advanced optimization techniques is necessary.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 109.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    These queries are described in Appendix Section at: http://www.lisi.ensma.fr/ftp/pub/documents/papers/2009/2009-DOLAP-Bellatreche.pdf

  2. 2.

    For example, we can partition a table using three fragmentation attributes.

  3. 3.

    The Oracle cost-based optimizer recognizes star queries.

References

  1. Bellatreche, L., Boukhalfa, K., Abdalla, H.I.: SAGA: A Combination of Genetic and Simulated Annealing Algorithms for Physical Data Warehouse Design. In: Proceedings of BNCOD’06, pp. 212–219 (2006)

    Google Scholar 

  2. Bellatreche, L., Boukhalfa, K., Richard, P.: Data Partitioning in Data Warehouses: Hardness Study, Heuristics and ORACLE Validation. In: Proceedings of DaWaK’2008, pp. 87–96 (2008)

    Google Scholar 

  3. Bellatreche, L., Boukhalfa, K., Richard, P., Woameno, K.Y.: Referential Horizontal Partitioning Selection Problem in Data Warehouses: Hardness Study and Selection Algorithms. In IJDWM. 5(4), 1–23 (2009)

    Google Scholar 

  4. Bellatreche, L., Karlapalem, K., Simonet. A.: Algorithms and Support for Horizontal Class Partitioning in Object-Oriented Databases. In the Distributed and Parallel Databases Journal, 8(2), 155–179 (2000)

    Google Scholar 

  5. Bellatreche, L., Woameno, K.Y.: Dimension Table Driven Approach to Referential Partition Relational Data Warehouses. In: ACM 12th International Workshop on Data Warehousing and OLAP (DOLAP), pp. 9–16 (2009)

    Google Scholar 

  6. Boukhalfa, K.: De la Conception Physique aux Outils d’Administration et de Tuning des Entrepts de Donnes. Poitiers University, France, PhD. Thesis. (2009)

    Google Scholar 

  7. Ceri, S., Negri, M., Pelagatti, G.: Horizontal Data Partitioning in Database Design. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. SIGPLAN Notices, pp. 128–136 (1982)

    Google Scholar 

  8. Cho, W.S., Park, C.M., Whang, K.Y., So, S.H.: A New Method for estimating the number of objects satisfying an object-oriented query involving partial participation of classes. Inf. Syst. 21(3), 253–267 (1996)

    Article  Google Scholar 

  9. Eadon, G., Chong, E.I., Shankar, S., Raghavan, A., Srinivasan, J., Das, S.: Supporting Table Partitioning By Reference in Oracle. In: Proceedings of SIGMOD’08, pp. 1111–1122 (2008)

    Google Scholar 

  10. Furtado, P.: Experimental evidence on partitioning in parallel data warehouses. In: Proceedings Of DOLAP, pp. 23–30 (2004)

    Google Scholar 

  11. Gibbons, P.B., Matias, Y., Poosala, V.: Fast incremental maintenance of approximate histograms. ACM Trans. Database Syst. 27(3), 261–298 (2002)

    Article  Google Scholar 

  12. Gray, J., Slutz, D.: Data Mining the SDSS SkyServer Database. Microsoft Research, Technical Report MSR-TR-2002-01 (2002)

    Google Scholar 

  13. Karlapalem, K., Navathe, S.B., Ammar, M.: Optimal Redesign Policies to Support Dynamic Processing of Applications on a Distributed Database System. Information Systems, 21(4), 353–367 (1996)

    Article  Google Scholar 

  14. Lei, H., Ross, K.A.: Faster Joins, Self-Joins and Multi-Way Joins Using Join Indices. In Data and Knowledge Engineering, 28(3), 277–298 (1998)

    MATH  Google Scholar 

  15. Legler, T., Lehner, W., Ross, A.: Query Optimization For Data Warehouse System With Different Data Distribution Strategies, In BTW, pp. 502–513 (2007)

    Google Scholar 

  16. Mahboubi, H., Darmont, J.: Data mining-based fragmentation of XML data warehouses. In: Proceedings DOLAP’08, pp. 9–16 (2008)

    Google Scholar 

  17. Munneke, D., Wahlstrom, K., Mohania, M.K.: Fragmentation of Multidimensional Databases. In: Proceedings of ADC’99 pp. 153–164 (1999)

    Google Scholar 

  18. Neumann, T.: Query simplification: graceful degradation for join-order optimization. In: Proceedings of SIGMOD’09, pp. 403–414 (2009)

    Google Scholar 

  19. Noaman, A.Y., Barker, K.: A Horizontal Fragmentation Algorithm for the Fact Relation in a Distributed Data Warehouse. In: Proceedings of CIKM’99, pp. 154–161 (1999)

    Google Scholar 

  20. Oracle Data Sheet: Oracle Partitioning (2007) White Paper: http://www.oracle.com/technology/products/bi/db/11g

  21. OLAP Council: APB-1 OLAP Benchmark, Release II. http://www.olapcouncil.org/research/bmarkly.htm (1998)

  22. Özsu, M.T., Valduriez, P.: Principles of Distributed Database Systems, Second Ed. Prentice Hall (1999)

    Google Scholar 

  23. Sanjay, A., Narasayya, V.R., Yang, B.: Integrating Vertical and Horizontal Partitioning Into Automated Physical Database Design. In: Proceedings of SIGMOD’04, pp. 359–370 (2004)

    Google Scholar 

  24. Simon, E.: Reality check: a case study of an EII research prototype encountering customer needs. In Proceedings of EDBT’08, pp. 1 (2008)

    Google Scholar 

  25. Stöhr, T., Märtens, H., Rahm, E.: Multi-Dimensional Database Allocation for Parallel Data Warehouses. In: Proceedings of VLDB2000, pp. 273–284 (2000)

    Google Scholar 

  26. Steinbrunn, M., Moerkotte, G., Kemper, A.: Heuristic and Randomized Optimization for the Join Ordering Problem. In VLDB Journal. 6(3), 191–208 (1997)

    Article  Google Scholar 

  27. Swami, A.N., Schiefer, K.B.: On the Estimation of Join Result Sizes. In: Proceedings of EDBT’04, pp. 287–300 (1994)

    Google Scholar 

  28. Sybase: Sybase Adaptive Server Enterprise 15 Data Partitioning. White paper (2005)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ladjel Bellatreche .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer Vienna

About this chapter

Cite this chapter

Bellatreche, L. (2012). Dimension Table Selection Strategies to Referential Partition a Fact Table of Relational Data Warehouses. In: Özyer, T., Kianmehr, K., Tan, M. (eds) Recent Trends in Information Reuse and Integration. Springer, Vienna. https://doi.org/10.1007/978-3-7091-0738-6_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-7091-0738-6_2

  • Published:

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-0737-9

  • Online ISBN: 978-3-7091-0738-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics