
On Estimating COUNT, SUM, and AVERAGE Relational Algebra Queries

  • G. Ozsoyoglu
  • K. Du
  • A. Tjahjana
  • W.-C. Hou
  • D. Y. Rowland

Abstract

CASE-DB is a relational database management system that allows users to specify time constraints in queries. For an aggregate query AGG(E), where AGG is one of COUNT, SUM, and AVERAGE, and E is a relational algebra expression, CASE-DB uses statistical estimators to approximate the query result. This paper extends our earlier work on the statistical estimators of CASE-DB with the following features: (a) new statistical estimators for COUNT queries with projection, (b) an extension of the methodology to SUM and AVERAGE aggregate queries, and (c) new sampling plans based on systematic sampling and stratified sampling. We also present performance evaluation experiments on the estimators with these extensions, using artificial database instances.
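
As a rough illustration of what estimating AGG(E) from a sample means, the sketch below applies the textbook scale-up estimator under simple random sampling without replacement to a single selection over one relation. It is not the CASE-DB estimator from the paper, and it does not cover projection, joins, or the systematic and stratified sampling plans mentioned above. The function name `estimate_aggregates` and its parameters are hypothetical and chosen only for this example.

```python
import random


def estimate_aggregates(rows, predicate, value, sample_size, seed=0):
    """Estimate COUNT, SUM and AVERAGE of value(r) over the rows that satisfy
    predicate, using a simple random sample drawn without replacement.

    Assumption: `rows` is a list standing in for a stored relation, and the
    query is a plain selection (no projection or join).
    """
    rng = random.Random(seed)
    population_size = len(rows)
    sample = rng.sample(rows, sample_size)

    # Values of the sampled tuples that satisfy the selection predicate.
    hits = [value(r) for r in sample if predicate(r)]

    # Scale sample totals up by the inverse of the sampling fraction.
    scale = population_size / sample_size
    count_estimate = scale * len(hits)
    sum_estimate = scale * sum(hits)

    # AVERAGE is estimated as a ratio; it is undefined if no tuple qualified.
    average_estimate = sum(hits) / len(hits) if hits else None
    return count_estimate, sum_estimate, average_estimate


if __name__ == "__main__":
    relation = list(range(1, 101))
    print(estimate_aggregates(relation, lambda r: r % 2 == 0, lambda r: r, 20))
```

When the sample covers the whole relation, the estimates coincide with the exact aggregates. Smaller samples trade accuracy for evaluation time, which is the trade-off a time-constrained query has to manage.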

Keywords

Systematic Sampling; Simple Random Sampling; Stratify Random Sampling; Relational Algebra; Inclusion Probability
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. [BuOv 79]
    Burnham, K.P., Overton, W.S., “Robust Estimation of Population Size When Capture Probabilities Vary Among Animals”, Ecology, Vol. 60, 1979.
  2. [Chao 84]
    Chao, A., “Nonparametric Estimation of the Number of Classes in a Population”, Scand. J. Stat., Vol. 11, 1984.
  3. [Coch 77]
    Cochran, W., “Sampling Techniques”, Third Ed., John Wiley & Sons, Inc., 1977.
  4. [Good 49]
    Goodman, L., “On the Estimation of the Number of Classes in a Population”, Ann. Math. Stat., Vol. 20, 1949.
  5. [HoOT 88]
    Hou, W-C., Ozsoyoglu, G., Taneja, B., “Statistical Estimators for Relational Algebra Expressions”, ACM PODS Conference, March 1988.
  6. [HoOT 89]
    Hou, W-C., Ozsoyoglu, G., Taneja, B., “Processing Aggregate Relational Queries with Hard Time Constraints”, ACM SIGMOD Conference, May 1989.
  7. [HoO 91]
    Hou, W-C., Ozsoyoglu, G., “Statistical Estimators for Aggregate Relational Algebra Expressions”, To appear in ACM TODS Journal.
  8. [Olke 86]
    Olken, F., “Physical Database Support for Scientific and Statistical Databases”, Third Int. Scientific and Statistical Databases Workshop, 1986.
  9. [OlkR 86]
    Olken, F., Rotem, D., “Simple Random Sampling from Relational Databases”, Proc., VLDB Conf. 1986.
  10. [LNS 90]
    Lipton, R., Naughton, J., Schneider, D., “Practical Selectivity Estimation through Adaptive Sampling”, ACM SIGMOD, 1990.
  11. [LiNa 89]
    Lipton, R., Naughton, J., “Query Size Estimation by Adaptive Sampling”, ACM PODS, 1990.

Copyright information

© Springer-Verlag Wien 1991

Authors and Affiliations

  • G. Ozsoyoglu (1)
  • K. Du (1)
  • A. Tjahjana (1)
  • W.-C. Hou (2)
  • D. Y. Rowland (3)

  1. Department of Computer Engineering and Science, Case Western Reserve University, Cleveland, USA
  2. Department of Computer Science, Southern Illinois University at Carbondale, Carbondale, USA
  3. D. Y. Rowland Associates, Cleveland Heights, USA
