Abstract
In recent years, wavelet-based synopses have been shown to be effective for approximate queries in database systems. The simplest wavelet synopses are constructed by computing the Haar transform over a vector consisting of either the raw data or the prefix sums of the data, and using a greedy heuristic to select the wavelet coefficients that are kept in the synopsis. The greedy heuristic is known to be optimal for point queries w.r.t. the mean-squared error, but no similar efficient optimality result was known for range-sum queries, for which the effectiveness of such synopses had only been shown experimentally.
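The greedy heuristic described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an input whose length is a power of two and an orthonormal Haar normalization, and the function names (`haar_transform`, `greedy_synopsis`, `reconstruct`) are hypothetical. The input vector may be either the raw data or its prefix sums.

```python
import math


def haar_transform(values):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    coeffs = [float(v) for v in values]
    n = len(coeffs)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    s = 1.0 / math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise averages go to the front, scaled differences behind them.
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) * s for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) * s for i in range(half)]
        coeffs[:length] = averages + details
        length = half
    return coeffs


def inverse_haar_transform(coeffs):
    """Inverse of haar_transform."""
    values = [float(c) for c in coeffs]
    n = len(values)
    s = 1.0 / math.sqrt(2.0)
    length = 2
    while length <= n:
        half = length // 2
        merged = []
        for i in range(half):
            a, d = values[i], values[half + i]
            merged.append((a + d) * s)
            merged.append((a - d) * s)
        values[:length] = merged
        length *= 2
    return values


def greedy_synopsis(values, m):
    """Keep the m orthonormal Haar coefficients of largest absolute value."""
    coeffs = haar_transform(values)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return {i: coeffs[i] for i in order[:m]}


def reconstruct(synopsis, n):
    """Rebuild an approximation of the length-n vector from retained coefficients."""
    coeffs = [0.0] * n
    for i, c in synopsis.items():
        coeffs[i] = c
    return inverse_haar_transform(coeffs)
```

Because the basis is orthonormal in this sketch, Parseval's identity makes the sum of squared point errors equal to the sum of squares of the dropped coefficients, which is why keeping the largest coefficients minimizes the mean-squared error for point queries.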
We construct an operator that defines a norm that is equivalent to the mean-squared error over all possible range-sum queries, where the norm is measured on the prefix-sums vector. We show that the Haar basis (and in fact any wavelet basis) is orthogonal w.r.t. the inner product defined by this novel operator. This allows us to use Parseval-based thresholding, and thus obtain the first linear-time construction of a provably optimal wavelet synopsis for range-sum queries. We show that the new thresholding is very similar to the greedy heuristic that is based on point queries.
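The orthogonality claim can be checked numerically on a small case. The sketch below is an assumption-laden illustration rather than the operator constructed in the paper: it writes the total squared range-sum error as a bilinear form over all pairs of positions a < b of a prefix-sums vector (a range sum is a difference of two prefix sums), and the names `haar_basis` and `range_form` are hypothetical.

```python
import math


def haar_basis(n):
    """Orthonormal Haar basis vectors for length n (a power of two)."""
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    basis = [[1.0 / math.sqrt(n)] * n]  # constant (scaling) vector
    support = n
    while support > 1:
        half = support // 2
        scale = 1.0 / math.sqrt(support)
        for start in range(0, n, support):
            vec = [0.0] * n
            for k in range(start, start + half):
                vec[k] = scale
            for k in range(start + half, start + support):
                vec[k] = -scale
            basis.append(vec)
        support = half
    return basis


def range_form(u, v):
    """Sum over all pairs a < b of (u[b] - u[a]) * (v[b] - v[a]).

    With u = v = error of an approximate prefix-sums vector, this is the
    total squared error over all ranges delimited by two vector positions.
    """
    n = len(u)
    return sum(
        (u[b] - u[a]) * (v[b] - v[a])
        for a in range(n)
        for b in range(a + 1, n)
    )


def gram_matrix(basis, form):
    """Matrix of form(b_i, b_j) for all pairs of basis vectors."""
    return [[form(bi, bj) for bj in basis] for bi in basis]


if __name__ == "__main__":
    n = 8
    gram = gram_matrix(haar_basis(n), range_form)
    worst = max(abs(gram[i][j]) for i in range(n) for j in range(n) if i != j)
    print("largest off-diagonal entry:", worst)
    print("diagonal:", [round(gram[i][i], 6) for i in range(n)])
```

In this sketch every off-diagonal entry vanishes up to rounding, each detail vector has weight n, and the constant vector has weight zero, since adding a constant to all prefix sums changes no difference of two of them. Under these assumptions the range-sum error of a synopsis is proportional to the sum of squares of the dropped detail coefficients, which is consistent with a thresholding rule that closely resembles the point-query greedy heuristic.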
For the case of range-sum queries over the raw data, we define a similar operator, and show that the Haar basis is not orthogonal w.r.t. the inner product defined by this operator.
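A small numerical check illustrates the raw-data case. As before, this is a hedged sketch and not the paper's operator: it assumes the bilinear form obtained by summing, over every range [i, j], the product of the two vectors' range sums, and the names `haar_basis` and `raw_range_form` are hypothetical.

```python
import math


def haar_basis(n):
    """Orthonormal Haar basis vectors for length n (a power of two)."""
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    basis = [[1.0 / math.sqrt(n)] * n]  # constant (scaling) vector
    support = n
    while support > 1:
        half = support // 2
        scale = 1.0 / math.sqrt(support)
        for start in range(0, n, support):
            vec = [0.0] * n
            for k in range(start, start + half):
                vec[k] = scale
            for k in range(start + half, start + support):
                vec[k] = -scale
            basis.append(vec)
        support = half
    return basis


def raw_range_form(u, v):
    """Sum over all ranges i <= j of (sum of u[i..j]) * (sum of v[i..j]).

    With u = v = error of an approximate raw-data vector, this is the
    total squared error over all range-sum queries.
    """
    n = len(u)
    total = 0.0
    for i in range(n):
        sum_u = 0.0
        sum_v = 0.0
        for j in range(i, n):
            sum_u += u[j]
            sum_v += v[j]
            total += sum_u * sum_v
    return total


def largest_off_diagonal(basis, form):
    """Largest |form(b_i, b_j)| over distinct basis vectors."""
    n = len(basis)
    return max(
        abs(form(basis[i], basis[j]))
        for i in range(n)
        for j in range(n)
        if i != j
    )


if __name__ == "__main__":
    print(largest_off_diagonal(haar_basis(4), raw_range_form))
```

Already for vectors of length 4 the largest off-diagonal entry is clearly nonzero, so under this form the Haar coefficients interact and dropping the smallest ones is no longer justified by a Parseval-style argument.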
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matias, Y., Urieli, D. (2006). Inner-Product Based Wavelet Synopses for Range-Sum Queries. In: Azar, Y., Erlebach, T. (eds) Algorithms – ESA 2006. ESA 2006. Lecture Notes in Computer Science, vol 4168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11841036_46
DOI: https://doi.org/10.1007/11841036_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38875-3
Online ISBN: 978-3-540-38876-0
eBook Packages: Computer Science, Computer Science (R0)