Skip to main content

Improving Range Query Result Size Estimation Based on a New Optimal Histogram

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 8132))

Abstract

Many commercial relational Data Base Management Systems (DBMSs) maintain histograms to approximate the distribution of values in the relation attributes and based on them estimate query result sizes. A histogram approximates the distribution by grouping data into buckets. The estimation-errors resulting from the loss of information during the grouping process affect the accuracy of the decision, made by query optimizers, about choosing the most economical evaluation plan for a query. In front of this challenging problem, many histogram-based estimation techniques including the equi-depth, the v-optimal, the max-diff and the compressed histograms have well contributed to approximate the cost of a query evaluation plan. But, most of the times the obtained estimates have much error. Motivated by the fact that inaccurate estimations can lead to wrong decisions, we propose in this paper an efficient algorithm, called Compressed-V2, for accurate histogram constructions. Both theoretical and effective experiments are done using benchmark data set showing the promising results obtained using the proposed algorithm. We think that this algorithm will significantly contribute for helping to solve the problem of Multi-Query Optimization (MQO) resulting from queries interactions especially in Relational Data Warehouses (RDW) which represent the ideal environment in which complex OLAP queries interact with each other.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Ioannidis, Y., Poosala, V.: Balancing histogram optimality and practicality for query result size estimation. In: Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, pp. 233–244 (1995)

    Google Scholar 

  2. Jagadish, H.V., Koudas, N., Muthukrishnan, S., Poosala, V., Sevcik, K., Suel, T.: Optimal histograms with quality guarantees. In: Proceedings of the 24th International Conference on Very Large Data Bases (VLDB), New York, USA, pp. 275–286 (1998)

    Google Scholar 

  3. Poosala, V., Ioannidis, Y.E., Haas, P.J., Shekita, E.J.: Improved histograms for selectivity estimation of range predicates. In: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pp. 294–305 (1996)

    Google Scholar 

  4. Jagadish, H.V., Jin, H., Ooi, B.C., Tan, K.-L.: Global optimization of histograms. In: Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, pp. 223–234 (2001)

    Google Scholar 

  5. Yu, C., Philip, G., Meng, W.: Distributed top-N query processing with possibly uncooperative local systems. In: Proc. 29th VLDB Conf., Berlin, Germany, pp. 117–128 (2003)

    Google Scholar 

  6. Selinger, P.G., Astrahan, M.M., Chamberlin, D.D., Lorie, R.A., Price, T.G.: Access path selection in a relational database management system. In: Proceedings of the ACM SIGMOD International Symposium on Management of Data, Boston, Mass., pp. 23–34 (June 1979)

    Google Scholar 

  7. John Oommen, B., Rueda, L.G.: An empirical comparison of histogram-like techniques for query optimization. In: Proceedings of the 2nd International Conference on Entreprise Information Systems, Stafford, UK, July 4-7, pp. 71–78 (2000)

    Google Scholar 

  8. Ioannidis, Y., Christodoulakis, S.: On the propagation of errors in the size of join results. In: Proceedings of the 1991 ACM SIGMOD Conference, Denver, CO, pp. 268–277 (May 1991)

    Google Scholar 

  9. Ioannidis, Y., Christodoulakis, S.: Optimal histograms for limiting worst-case error propagation in the estimates of query optimizers. To appear in ACM-TODS (1992)

    Google Scholar 

  10. Kooi, R.P.: The optimization of queries in relational databases. PhD thesis, Case Western Reserver University (September 1980)

    Google Scholar 

  11. Shapiro, G.P., Connell, C.: Accurate Estimation of the Number of Tuples Satisfying a Condition. In: Proceedings of ACM-SIGMOD Conference, pp. 256–276 (1984)

    Google Scholar 

  12. Ioannidis, Y.: Universality of serial histograms. In: Proceedings of the 19th Int. Conf. on Very Large Databases, pp. 256–267 (December 1993)

    Google Scholar 

  13. Poosala, V., Ioannidis, Y.: Estimation of query-result distribution and its application in parallel-join load balancing. In: Proceedings of the 22nd Int. Conf. on Very Large Databases, pp. 448–459 (1996)

    Google Scholar 

  14. Gupta, A., Sudarshan, S., Viswanathan, S.: Query scheduling in multi query optimization. In: IDEAS, pp. 11–19 (2001)

    Google Scholar 

  15. Thomas, D., Diwan, A.A., Sudarshan, S.: Scheduling and caching in multi query optimization. In: COMAD, pp. 150–153 (2006)

    Google Scholar 

  16. Kerkad, A., Bellatreche, L., Geniet, D.: Queen-Bee: Query interaction- aware for buffer allocation and scheduling problem. In: Cuzzocrea, A., Dayal, U. (eds.) DaWaK 2012. LNCS, vol. 7448, pp. 156–167. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  17. Ioannidis, Y.: Query optimization. In: ACM Computing Surveys, Symposium Issue on the 50th Anniversary of ACM, vol. 28, pp. 121–123 (1996)

    Google Scholar 

  18. Christodoulakis, S.: Implications of certain assumptions in database performance evaluation. ACM TODS 9(2), 163–186 (1984)

    Article  MathSciNet  MATH  Google Scholar 

  19. Zipf, G.K.: Human Behavior and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley, Cambridge (1949)

    Google Scholar 

  20. Liu, Y.: Data preprocessing. Department of Biomedical, Industrial and Human Factors Engineering Wright State University (2010)

    Google Scholar 

  21. Ioannidis, Y., Poosala, V.: Histogram-based solutions to diverse database estimation problems. IEEE Data Engineering Bulletin 18(3), 10–18 (1995)

    Google Scholar 

  22. Muralikrishna, M., Dewitt, D.J.: Equi-depth histograms for estimating selectivity factors for multi-dimensional queries. In: Proceedings of ACM SIGMOD Conference, pp. 28–36 (1988)

    Google Scholar 

  23. Mousavi, H., Zaniolo, C.: Fast and Accurate Computation of Equi-Depth Histograms over Data Streams. In: Proceedings of EDBT, Uppsala, Sweden, March 22-24 (2011)

    Google Scholar 

  24. Gomes, J.S.: Adaptive Histogram Algorithms for Approximating Frequency Queries in Dynamic Data Streams. In: 12th International Conference on Internet Computing, ICOMP 2011, Las Vegas, NV, July 18-21 (2011)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Labbadi, W., Akaichi, J. (2013). Improving Range Query Result Size Estimation Based on a New Optimal Histogram. In: Larsen, H.L., Martin-Bautista, M.J., Vila, M.A., Andreasen, T., Christiansen, H. (eds) Flexible Query Answering Systems. FQAS 2013. Lecture Notes in Computer Science(), vol 8132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40769-7_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-40769-7_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40768-0

  • Online ISBN: 978-3-642-40769-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics