Abstract
On-line analytical processing (OLAP) supports multidimensional data analysis through extensive aggregation along many dimensions and hierarchies. To reduce query response time, pre-computed results are often stored for later retrieval, but doing so for the whole set of aggregates incurs a prohibitive storage overhead. In this paper we describe a novel approach for the efficient selection, computation and storage of multidimensional aggregates. The approach identifies redundant aggregates by inspection, so that only distinct aggregates need to be computed and stored. We propose extensions to relational theory and present new algorithms for implementing the approach, providing a solution that is both scalable and low in complexity. Experiments on real and synthetic datasets demonstrate that eliminating redundant aggregates yields significant savings in computation time and storage space, and that these savings increase as dimensionality increases. Finally, this work has implications for the indexing and maintenance of views and for the user interface.
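To make the notion of a redundant aggregate concrete, the following is a minimal sketch and not the authors' algorithm. It assumes one simple criterion: a group-by view is redundant when it has exactly as many groups as a parent view with one more dimension, because then no groups were merged and the parent already holds the same aggregate values. The table `ROWS`, the dimension names, and the functions `group_by` and `redundant_views` are hypothetical, introduced only for illustration. Unlike the approach described above, which detects redundancy by inspection, this sketch computes every view in order to compare sizes.

```python
from itertools import combinations

# Hypothetical toy fact table: (product, store, year, sales).
# The year dimension holds a single value, which makes some views redundant.
ROWS = [
    ("pen", "north", 2000, 5),
    ("pen", "south", 2000, 3),
    ("ink", "north", 2000, 7),
    ("ink", "south", 2000, 2),
]
DIMS = ("product", "store", "year")


def group_by(rows, dims, keep):
    """Return SUM(sales) grouped by the dimensions listed in `keep`."""
    idx = [dims.index(d) for d in keep]
    out = {}
    for row in rows:
        key = tuple(row[i] for i in idx)
        out[key] = out.get(key, 0) + row[-1]
    return out


def redundant_views(rows, dims):
    """Map each redundant view to a parent view that holds the same values.

    Assumed criterion (for illustration only): a view is redundant when
    it has as many groups as a parent with one extra dimension.
    """
    # Number of groups in each of the 2^d views of the cube.
    sizes = {}
    for k in range(len(dims) + 1):
        for keep in combinations(dims, k):
            sizes[keep] = len(group_by(rows, dims, keep))

    redundant = {}
    for view, n in sizes.items():
        for d in dims:
            if d in view:
                continue
            # Parent view: the same dimensions plus d, in canonical order.
            parent = tuple(x for x in dims if x in view or x == d)
            if sizes[parent] == n:
                redundant[view] = parent
                break
    return redundant


if __name__ == "__main__":
    for view, parent in redundant_views(ROWS, DIMS).items():
        print(view, "can be answered from", parent)
```

On this toy table, four of the eight views in the cube are flagged as redundant (the empty view, `product`, `store`, and `product, store`), each because adding the single-valued `year` dimension leaves the number of groups unchanged. Only the remaining four views would need to be stored under this criterion.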
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kotsis, N., McGregor, D.R. (2000). Elimination of Redundant Views in Multidimensional Aggregates. In: Kambayashi, Y., Mohania, M., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2000. Lecture Notes in Computer Science, vol 1874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44466-1_15
DOI: https://doi.org/10.1007/3-540-44466-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67980-6
Online ISBN: 978-3-540-44466-4
eBook Packages: Springer Book Archive