Abstract
Approximate query processing has emerged as an approach to handling the huge data volumes and complex queries found in data warehouse environments. In this paper, we present a novel method that provides approximate answers to OLAP queries. The method builds a compressed (approximate) data cube using a clustering technique and answers queries directly from this compressed cube, which improves query performance. We also give the algorithm for answering OLAP queries and the confidence intervals of the query results. An extensive experimental study with the OLAP Council benchmark shows the effectiveness and scalability of our cluster-based approach compared with sampling.
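The general idea of answering an aggregate query from a compressed cube rather than from the base data can be illustrated with a minimal sketch. This is not the algorithm of the paper, whose details the abstract does not give. It assumes a single continuous dimension, uses fixed-width bucketing as a simple stand-in for the clustering step, assumes values are spread uniformly inside each cluster, and omits confidence intervals. The names `ClusterSummary`, `build_summaries`, and `approx_sum` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterSummary:
    lo: float     # smallest dimension value seen in the cluster
    hi: float     # largest dimension value seen in the cluster
    count: int    # number of fact rows in the cluster
    total: float  # sum of the measure over those rows


def build_summaries(rows, width):
    """Group (dim, measure) rows into fixed-width buckets and summarize each.

    Assumption: bucketing stands in for a real clustering technique.
    """
    buckets = {}
    for dim, measure in rows:
        key = int(dim // width)
        s = buckets.get(key)
        if s is None:
            buckets[key] = ClusterSummary(dim, dim, 1, measure)
        else:
            s.lo = min(s.lo, dim)
            s.hi = max(s.hi, dim)
            s.count += 1
            s.total += measure
    return [buckets[k] for k in sorted(buckets)]


def approx_sum(summaries, q_lo, q_hi):
    """Estimate SUM(measure) for q_lo <= dim <= q_hi using summaries only."""
    estimate = 0.0
    for s in summaries:
        overlap_lo = max(s.lo, q_lo)
        overlap_hi = min(s.hi, q_hi)
        if overlap_lo > overlap_hi:
            continue  # cluster lies entirely outside the query range
        if s.hi == s.lo:
            fraction = 1.0  # single-point cluster that falls inside the range
        else:
            # Uniformity assumption: contribution is proportional to overlap.
            fraction = (overlap_hi - overlap_lo) / (s.hi - s.lo)
        estimate += s.total * fraction
    return estimate


rows = [(0.0, 10.0), (1.0, 20.0), (2.0, 30.0), (3.0, 40.0)]
summaries = build_summaries(rows, width=2)
estimate = approx_sum(summaries, 2.5, 3.0)  # exact answer from the rows is 40.0
```

The query touches only the cluster summaries, so its cost depends on the number of clusters rather than the number of fact rows. The price is an error that grows when the data inside a cluster is far from uniform, which is why an error bound such as a confidence interval matters in practice.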
Additional information
This work is supported by the National Natural Science Foundation of China (Grant No.69973050) and the National NKBRSF ‘973’ Project of China (Grant No.2001CCA03000).
FENG Yu received her M.S. degree and joined the faculty of the School of Information, Renmin University of China, in 1996. She is currently a Ph.D. candidate in the Institute of Computing Technology, the Chinese Academy of Sciences. Her research interests are data warehousing, OLAP and data mining.
For the biography of WANG Shan, please refer to P.396, No.4, Vol.17 of this journal.
Cite this article
Feng, Y., Wang, S. Compressed data cube for approximate OLAP query processing. J. Comput. Sci. & Technol. 17, 625–635 (2002). https://doi.org/10.1007/BF02948830
DOI: https://doi.org/10.1007/BF02948830