Categorical Range Queries in Large Databases

  • Alexandros Nanopoulos
  • Panayiotis Bozanis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2750)

Abstract

In this paper, we introduce categorical (a.k.a. chromatic) range queries (CRQs) in the context of large, disk-resident data sets, motivated by the fact that CRQs are conceptually simple and arise often in DBMSs. Building on spatial data structures, and R-trees in particular, we propose a multi-tree index that follows the broad concept of augmenting nodes with additional information to accelerate queries. Augmentation is examined with respect to the maximal/minimal points in subtrees, whose properties the proposed search algorithm exploits to prune the search space effectively. Detailed experimental results, with both real and synthetic data, illustrate significant performance gains (up to an order of magnitude) for the proposed method, compared to a regular range query followed by filtering with respect to categories and to a naive R-tree augmentation method.
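
To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of a regular range query followed by filtering with respect to categories. It assumes two-dimensional points and an axis-aligned query rectangle, and it uses a linear scan in place of an R-tree range query. The names `Point`, `Rect`, and `crq_by_filtering` are hypothetical and introduced only for illustration; this is not the multi-tree index proposed in the paper.

```python
from typing import Iterable, NamedTuple, Set


class Point(NamedTuple):
    """A 2-D point carrying a categorical attribute (hypothetical layout)."""
    x: float
    y: float
    category: str


class Rect(NamedTuple):
    """An axis-aligned query rectangle with inclusive bounds (an assumption)."""
    x_lo: float
    y_lo: float
    x_hi: float
    y_hi: float

    def contains(self, p: Point) -> bool:
        return self.x_lo <= p.x <= self.x_hi and self.y_lo <= p.y <= self.y_hi


def crq_by_filtering(points: Iterable[Point], query: Rect) -> Set[str]:
    """Baseline CRQ: fetch every point in the range, then keep distinct categories.

    The linear scan stands in for an ordinary spatial range query; the cost
    grows with the number of points in the range, not with the number of
    distinct categories that are actually reported.
    """
    hits = [p for p in points if query.contains(p)]  # regular range query
    return {p.category for p in hits}                # filtering w.r.t. categories


if __name__ == "__main__":
    data = [
        Point(1, 1, "a"),
        Point(2, 2, "a"),
        Point(3, 3, "b"),
        Point(9, 9, "c"),
    ]
    print(sorted(crq_by_filtering(data, Rect(0, 0, 4, 4))))
```

The sketch shows why the baseline can be wasteful: many points in the range may share a category, yet each one is retrieved before duplicates are discarded.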

Keywords

Execution Time, Domain Size, Range Query, Categorical Attribute, Replication Factor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Alexandros Nanopoulos (1)
  • Panayiotis Bozanis (2)
  1. Dept. Informatics, Aristotle University, Thessaloniki, Greece
  2. Dept. Computer Eng. and Telecom, University of Thessaly, Volos, Greece
