PL-Tree: An Efficient Indexing Method for High-Dimensional Data

  • Conference paper
Advances in Spatial and Temporal Databases (SSTD 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8098)

Abstract

The quest for processing data in high-dimensional space has resulted in a number of innovative indexing mechanisms. Choosing an appropriate indexing method for a given set of data requires careful consideration of data properties, data construction methods, and query types. We present a new indexing method to support efficient point queries, range queries, and k-nearest neighbor queries. Our method indexes objects dynamically using algebraic techniques, and it can substantially reduce the negative impact of the “curse of dimensionality”. In particular, our method partitions the data space recursively into hypercubes of a given capacity and labels each hypercube using the Cantor pairing function, so that all objects in the same hypercube have the same label. Because the Cantor pairing function is bijective and cheap to compute, high-dimensional vectors and scalar labels can be mapped to each other efficiently. The partitioning and labeling process splits a subspace when the number of data items it contains exceeds its capacity. From the data structure point of view, our method constructs a tree in which each parent node contains a number of labels and child pointers; we call it a PL-tree. We compare our method with popular indexing algorithms including R*-tree, X-tree, quad-tree, and iDistance. Our numerical results show that the dynamic PL-tree indexing significantly outperforms the existing indexing mechanisms.
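
The labeling idea in the abstract can be illustrated with a minimal sketch. The two-argument Cantor pairing function and its inverse below are the standard definitions. Everything else is an assumption made for illustration only: the abstract does not say how the pairing is extended to more than two dimensions, so the sketch folds it over the coordinates left to right, and the helper names (`label`, `unlabel`, `cell_of`) and the uniform grid over the unit hypercube are hypothetical, not the paper's construction.

```python
from math import isqrt
from typing import Sequence, Tuple


def cantor_pair(x: int, y: int) -> int:
    """Standard Cantor pairing: a bijection from pairs of non-negative integers to non-negative integers."""
    if x < 0 or y < 0:
        raise ValueError("Cantor pairing is defined for non-negative integers")
    return (x + y) * (x + y + 1) // 2 + y


def cantor_unpair(z: int) -> Tuple[int, int]:
    """Inverse of cantor_pair, using integer square root to stay exact."""
    w = (isqrt(8 * z + 1) - 1) // 2   # index of the diagonal containing z
    y = z - w * (w + 1) // 2          # offset along that diagonal
    return w - y, y


def label(coords: Sequence[int]) -> int:
    """Fold the pairing over a coordinate vector (assumed extension to d dimensions)."""
    acc = coords[0]
    for c in coords[1:]:
        acc = cantor_pair(acc, c)
    return acc


def unlabel(z: int, dim: int) -> Tuple[int, ...]:
    """Recover the coordinate vector from a label produced by label()."""
    coords = []
    for _ in range(dim - 1):
        z, c = cantor_unpair(z)
        coords.append(c)
    coords.append(z)
    return tuple(reversed(coords))


def cell_of(point: Sequence[float], cells_per_axis: int) -> Tuple[int, ...]:
    """Integer grid cell of a point in the unit hypercube (hypothetical uniform grid)."""
    return tuple(min(int(p * cells_per_axis), cells_per_axis - 1) for p in point)


# Two points that fall in the same hypercube receive the same scalar label.
a = label(cell_of((0.1, 0.6, 0.9), 2))
b = label(cell_of((0.2, 0.7, 0.8), 2))
print(a == b, unlabel(a, 3))
```

Because the pairing is a bijection, `unlabel` recovers the cell coordinates exactly, which is the property the abstract relies on for mapping between vectors and scalar labels. Note that folded labels grow quickly with dimension; Python's arbitrary-precision integers hide this, but a fixed-width implementation would need to account for it.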

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, J., Lu, J., Fang, Z., Ge, T., Chen, C. (2013). PL-Tree: An Efficient Indexing Method for High-Dimensional Data. In: Nascimento, M.A., et al. Advances in Spatial and Temporal Databases. SSTD 2013. Lecture Notes in Computer Science, vol 8098. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40235-7_11

  • DOI: https://doi.org/10.1007/978-3-642-40235-7_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40234-0

  • Online ISBN: 978-3-642-40235-7

  • eBook Packages: Computer Science, Computer Science (R0)
