Dimension Table Selection Strategies to Referential Partition a Fact Table of Relational Data Warehouses

Bellatreche, Ladjel

doi:10.1007/978-3-7091-0738-6_2

Ladjel Bellatreche⁴

501 Accesses
6 Citations

Abstract

Enterprise wide data warehouses are becoming increasingly adopted as the main source and underlying infrastructure for business intelligence (BI) solutions. Note that a data warehouse can be viewed as an integration system, where data sources are duplicated in the same repository. Data warehouses are designed to handle the queries required to discover trends and critical factors are called Online Analytical Processing (OLAP) systems. Examples of an OLAP query are: Amazon (www.amazon.com) company analyzes purchases by its customers to come up with an individual screen with products of likely interest to the customer. Analysts at Wal-Mart (www.walmart.com) look for items with increasing sales in some city. Star schemes or their variants are usually used to model warehouse applications. They are composed of thousand of dimension tables and multiple fact tables [15, 18]. Figure 2.1 shows an example of star schema of the widely-known data warehouse benchmark APB-1 release II [21]. Here, the fact table Sales is joint to the following four dimension tables: Product, Customer, Time, Channel. Star queries are typically executed against the warehouse. Queries running on such applications contain a large number of costly joins, selections and aggregations. They are called mega queries [24]. To optimize these queries, the use of advanced optimization techniques is necessary.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Hardcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
These queries are described in Appendix Section at: http://www.lisi.ensma.fr/ftp/pub/documents/papers/2009/2009-DOLAP-Bellatreche.pdf
2.
For example, we can partition a table using three fragmentation attributes.
3.
The Oracle cost-based optimizer recognizes star queries.

References

Bellatreche, L., Boukhalfa, K., Abdalla, H.I.: SAGA: A Combination of Genetic and Simulated Annealing Algorithms for Physical Data Warehouse Design. In: Proceedings of BNCOD’06, pp. 212–219 (2006)
Google Scholar
Bellatreche, L., Boukhalfa, K., Richard, P.: Data Partitioning in Data Warehouses: Hardness Study, Heuristics and ORACLE Validation. In: Proceedings of DaWaK’2008, pp. 87–96 (2008)
Google Scholar
Bellatreche, L., Boukhalfa, K., Richard, P., Woameno, K.Y.: Referential Horizontal Partitioning Selection Problem in Data Warehouses: Hardness Study and Selection Algorithms. In IJDWM. 5(4), 1–23 (2009)
Google Scholar
Bellatreche, L., Karlapalem, K., Simonet. A.: Algorithms and Support for Horizontal Class Partitioning in Object-Oriented Databases. In the Distributed and Parallel Databases Journal, 8(2), 155–179 (2000)
Google Scholar
Bellatreche, L., Woameno, K.Y.: Dimension Table Driven Approach to Referential Partition Relational Data Warehouses. In: ACM 12th International Workshop on Data Warehousing and OLAP (DOLAP), pp. 9–16 (2009)
Google Scholar
Boukhalfa, K.: De la Conception Physique aux Outils d’Administration et de Tuning des Entrepts de Donnes. Poitiers University, France, PhD. Thesis. (2009)
Google Scholar
Ceri, S., Negri, M., Pelagatti, G.: Horizontal Data Partitioning in Database Design. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. SIGPLAN Notices, pp. 128–136 (1982)
Google Scholar
Cho, W.S., Park, C.M., Whang, K.Y., So, S.H.: A New Method for estimating the number of objects satisfying an object-oriented query involving partial participation of classes. Inf. Syst. 21(3), 253–267 (1996)
Article Google Scholar
Eadon, G., Chong, E.I., Shankar, S., Raghavan, A., Srinivasan, J., Das, S.: Supporting Table Partitioning By Reference in Oracle. In: Proceedings of SIGMOD’08, pp. 1111–1122 (2008)
Google Scholar
Furtado, P.: Experimental evidence on partitioning in parallel data warehouses. In: Proceedings Of DOLAP, pp. 23–30 (2004)
Google Scholar
Gibbons, P.B., Matias, Y., Poosala, V.: Fast incremental maintenance of approximate histograms. ACM Trans. Database Syst. 27(3), 261–298 (2002)
Article Google Scholar
Gray, J., Slutz, D.: Data Mining the SDSS SkyServer Database. Microsoft Research, Technical Report MSR-TR-2002-01 (2002)
Google Scholar
Karlapalem, K., Navathe, S.B., Ammar, M.: Optimal Redesign Policies to Support Dynamic Processing of Applications on a Distributed Database System. Information Systems, 21(4), 353–367 (1996)
Article Google Scholar
Lei, H., Ross, K.A.: Faster Joins, Self-Joins and Multi-Way Joins Using Join Indices. In Data and Knowledge Engineering, 28(3), 277–298 (1998)
MATH Google Scholar
Legler, T., Lehner, W., Ross, A.: Query Optimization For Data Warehouse System With Different Data Distribution Strategies, In BTW, pp. 502–513 (2007)
Google Scholar
Mahboubi, H., Darmont, J.: Data mining-based fragmentation of XML data warehouses. In: Proceedings DOLAP’08, pp. 9–16 (2008)
Google Scholar
Munneke, D., Wahlstrom, K., Mohania, M.K.: Fragmentation of Multidimensional Databases. In: Proceedings of ADC’99 pp. 153–164 (1999)
Google Scholar
Neumann, T.: Query simplification: graceful degradation for join-order optimization. In: Proceedings of SIGMOD’09, pp. 403–414 (2009)
Google Scholar
Noaman, A.Y., Barker, K.: A Horizontal Fragmentation Algorithm for the Fact Relation in a Distributed Data Warehouse. In: Proceedings of CIKM’99, pp. 154–161 (1999)
Google Scholar
Oracle Data Sheet: Oracle Partitioning (2007) White Paper: http://www.oracle.com/technology/products/bi/db/11g
OLAP Council: APB-1 OLAP Benchmark, Release II. http://www.olapcouncil.org/research/bmarkly.htm (1998)
Özsu, M.T., Valduriez, P.: Principles of Distributed Database Systems, Second Ed. Prentice Hall (1999)
Google Scholar
Sanjay, A., Narasayya, V.R., Yang, B.: Integrating Vertical and Horizontal Partitioning Into Automated Physical Database Design. In: Proceedings of SIGMOD’04, pp. 359–370 (2004)
Google Scholar
Simon, E.: Reality check: a case study of an EII research prototype encountering customer needs. In Proceedings of EDBT’08, pp. 1 (2008)
Google Scholar
Stöhr, T., Märtens, H., Rahm, E.: Multi-Dimensional Database Allocation for Parallel Data Warehouses. In: Proceedings of VLDB2000, pp. 273–284 (2000)
Google Scholar
Steinbrunn, M., Moerkotte, G., Kemper, A.: Heuristic and Randomized Optimization for the Join Ordering Problem. In VLDB Journal. 6(3), 191–208 (1997)
Article Google Scholar
Swami, A.N., Schiefer, K.B.: On the Estimation of Join Result Sizes. In: Proceedings of EDBT’04, pp. 287–300 (1994)
Google Scholar
Sybase: Sybase Adaptive Server Enterprise 15 Data Partitioning. White paper (2005)
Google Scholar

Download references

Author information

Authors and Affiliations

LISI/ENSMA – Poitiers University, 1, avenue Clément Ader, 86960, Futuroscope, France
Ladjel Bellatreche

Authors

Ladjel Bellatreche
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ladjel Bellatreche .

Editor information

Editors and Affiliations

, Department of Computer Engineering, Tobb University, Sögütözü Caddesi 43, Ankara, 06560, Turkey
Tansel Özyer
, Department of Electrical Engineering, University of Western Ontario, Building 363, London, N6A 5B9, Ontario, Canada
Keivan Kianmehr
, Department of Computer Engineering, Tobb University, Söğütözü Caddesi 43, Ankara, 06560, Ankara, Turkey
Mehmet Tan

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Bellatreche, L. (2012). Dimension Table Selection Strategies to Referential Partition a Fact Table of Relational Data Warehouses. In: Özyer, T., Kianmehr, K., Tan, M. (eds) Recent Trends in Information Reuse and Integration. Springer, Vienna. https://doi.org/10.1007/978-3-7091-0738-6_2

Download citation

DOI: https://doi.org/10.1007/978-3-7091-0738-6_2
Published: 20 August 2011
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-0737-9
Online ISBN: 978-3-7091-0738-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics