Abstract
In this paper, we propose s-OLAP, a framework for approximate range query evaluation on data cubes that combines two innovative perspectives of OLAP research, namely dimensionality reduction and probabilistic synopses. The application scenario of s-OLAP is a networked, heterogeneous, very large Data Warehousing environment in which traditional algorithms for processing OLAP queries are too expensive and impractical because of the size of the data cubes and the computational cost of accessing and processing multidimensional data. s-OLAP relies on intelligent data representation and processing techniques, among which are: (i) the use of the Karhunen-Loeve Transform (KLT) to reduce the dimensionality of data cubes, and (ii) a probabilistic framework that provides a rigorous theoretical basis for probabilistic guarantees on the degree of approximation of the retrieved answers, which is a critical point for approximate query answering techniques in OLAP.
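To make the idea of a probabilistic guarantee on an approximate answer concrete, the following is a minimal sketch and not the s-OLAP algorithm itself, whose details the abstract does not give. It assumes that cell values are bounded in a known interval, that cells are sampled uniformly with replacement, and that the error bound comes from Hoeffding's inequality. The function names `hoeffding_sample_size` and `approximate_range_sum` are hypothetical and chosen only for illustration.

```python
import math
import random


def hoeffding_sample_size(epsilon, delta, value_range):
    """Number of samples n such that the sample mean deviates from the true
    mean by more than epsilon with probability at most delta, for values
    confined to an interval of width value_range (Hoeffding's inequality:
    P(|mean_n - mu| >= epsilon) <= 2 * exp(-2 * n * epsilon^2 / range^2))."""
    return math.ceil(value_range ** 2 * math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def approximate_range_sum(cells, lo, hi, epsilon, delta, rng):
    """Estimate the sum of `cells` (the cells selected by a range query,
    assumed here to be a flat list with values in [lo, hi]) by uniform
    sampling with replacement.

    Returns (estimate, error_bound, n): with probability at least
    1 - delta, |estimate - true_sum| <= error_bound."""
    n = hoeffding_sample_size(epsilon, delta, hi - lo)
    sample_mean = sum(rng.choice(cells) for _ in range(n)) / n
    # The bound on the mean scales by the number of cells for the sum.
    return sample_mean * len(cells), epsilon * len(cells), n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for repeatability only
    cells = [(i % 10) / 10 for i in range(1000)]  # illustrative data in [0, 0.9]
    estimate, bound, n = approximate_range_sum(cells, 0.0, 1.0, 0.05, 0.001, rng)
    print(f"samples={n} estimate={estimate:.1f} +/- {bound:.1f} exact={sum(cells):.1f}")
```

In this sketch the sample size depends only on the requested accuracy, the confidence level, and the value range, not on the number of cells covered by the query, which is the usual motivation for sampling-based answers on very large cubes.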
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cuzzocrea, A. (2009). s-OLAP: Approximate OLAP Query Evaluation on Very Large Data Warehouses via Dimensionality Reduction and Probabilistic Synopses. In: Filipe, J., Cordeiro, J. (eds) Enterprise Information Systems. ICEIS 2009. Lecture Notes in Business Information Processing, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01347-8_21
DOI: https://doi.org/10.1007/978-3-642-01347-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01346-1
Online ISBN: 978-3-642-01347-8
eBook Packages: Computer Science (R0)