Self-tuning UDF Cost Modeling Using the Memory-Limited Quadtree

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2992)

Abstract

Query optimizers in object-relational database management systems require users to provide execution cost models for user-defined functions (UDFs). Despite this need, little work has been done on providing such models. Furthermore, none of the existing approaches is self-tuning, so none can adapt to changing UDF execution patterns. This paper addresses the problem by introducing a self-tuning cost modeling approach based on the quadtree. The quadtree inherently (1) supports fast retrievals, (2) allows fast incremental updates (without storing individual data points), and (3) stores information at different resolutions. We take advantage of these properties and add two abilities that make the quadtree useful for UDF cost modeling: (1) adapting to changing UDF execution patterns and (2) operating within limited memory. To this end, we have developed a novel technique that we call the memory-limited quadtree (MLQ). In MLQ, each instance of UDF execution is mapped to a query point in a multi-dimensional space. A prediction is made at the query point, and the actual value observed at that point is then inserted as a new data point. The quadtree stores summary information about the data points at different resolutions, chosen according to the distribution of the data points. This information is used to make predictions, to guide the insertion of new data points, and to guide the compression of the quadtree when the memory limit is reached. We have conducted extensive performance evaluations comparing MLQ with the existing (static) approach.
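The predict, insert, and compress cycle described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SketchMLQ`, the two-dimensional space, the per-block summary (a count and a sum of costs), the fixed `max_depth`, the node budget `max_nodes`, and the compression rule (drop the leaf that summarizes the fewest points) are all assumptions made for illustration. The abstract states only that summaries are kept at several resolutions and that the tree is compressed when a memory limit is reached.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    lo: Point                      # lower corner of the block
    hi: Point                      # upper corner of the block
    count: int = 0                 # number of data points summarized here
    total: float = 0.0             # sum of the observed costs
    children: Dict[int, "Node"] = field(default_factory=dict)

    def mean(self) -> float:
        return self.total / self.count


class SketchMLQ:
    """Hypothetical memory-limited quadtree; all policies are assumptions."""

    def __init__(self, lo: Point, hi: Point, max_nodes: int = 64, max_depth: int = 6):
        self.root = Node(lo, hi)
        self.max_nodes = max_nodes     # assumed memory limit, counted in nodes
        self.max_depth = max_depth     # assumed finest resolution
        self.n_nodes = 1

    @staticmethod
    def _mid(node: Node) -> Point:
        return ((node.lo[0] + node.hi[0]) / 2, (node.lo[1] + node.hi[1]) / 2)

    def _quadrant(self, node: Node, p: Point) -> int:
        mx, my = self._mid(node)
        return (1 if p[0] >= mx else 0) | (2 if p[1] >= my else 0)

    def _child_bounds(self, node: Node, q: int) -> Tuple[Point, Point]:
        mx, my = self._mid(node)
        lo = (mx if q & 1 else node.lo[0], my if q & 2 else node.lo[1])
        hi = (node.hi[0] if q & 1 else mx, node.hi[1] if q & 2 else my)
        return lo, hi

    def predict(self, p: Point) -> Optional[float]:
        """Return the mean cost of the finest existing block that contains p."""
        node = self.root
        if node.count == 0:
            return None                # nothing observed yet
        while True:
            child = node.children.get(self._quadrant(node, p))
            if child is None:
                return node.mean()
            node = child

    def insert(self, p: Point, cost: float) -> None:
        """Fold the observed cost into every block on the path; the point itself is not stored."""
        node, depth = self.root, 0
        while True:
            node.count += 1
            node.total += cost
            if depth == self.max_depth:
                break
            q = self._quadrant(node, p)
            if q not in node.children:
                node.children[q] = Node(*self._child_bounds(node, q))
                self.n_nodes += 1
            node, depth = node.children[q], depth + 1
        while self.n_nodes > self.max_nodes:
            self._compress()

    def _compress(self) -> None:
        """Assumed policy: remove the leaf that summarizes the fewest points."""
        best = None                    # (count, parent, quadrant)
        stack = [self.root]
        while stack:
            node = stack.pop()
            for q, child in node.children.items():
                if child.children:
                    stack.append(child)
                elif best is None or child.count < best[0]:
                    best = (child.count, node, q)
        _, parent, q = best
        del parent.children[q]
        self.n_nodes -= 1


mlq = SketchMLQ((0.0, 0.0), (1.0, 1.0), max_nodes=16, max_depth=4)
mlq.insert((0.1, 0.1), 10.0)   # observed cost of one UDF call
mlq.insert((0.9, 0.9), 50.0)
print(mlq.predict((0.1, 0.1)), mlq.predict((0.1, 0.9)))
```

In this sketch a query that falls in an unpopulated quadrant falls back to the coarser summary of its parent block, which is one way the multi-resolution property can be used for prediction. Because removing a leaf leaves its contribution in every ancestor's summary, compression here loses resolution rather than observations.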

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

He, Z., Lee, B.S., Snapp, R.R. (2004). Self-tuning UDF Cost Modeling Using the Memory-Limited Quadtree. In: Bertino, E., et al. Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24741-8_30

  • DOI: https://doi.org/10.1007/978-3-540-24741-8_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21200-3

  • Online ISBN: 978-3-540-24741-8

  • eBook Packages: Springer Book Archive
