High-Dimensional Indexing

  • Reference work entry
Encyclopedia of Database Systems

Synonyms

Indexing for similarity search

Definition

The term high-dimensional indexing [6, 9] covers three things: techniques for indexing vector spaces that address problems specific to high-dimensional data spaces; optimization techniques that improve index structures; and algorithms for the variants of similarity search (nearest neighbor queries, reverse nearest neighbor queries, range queries, similarity joins, etc.) in high-dimensional spaces. The well-known curse of dimensionality causes index selectivity to deteriorate as the dimensionality of the data space grows. The effect already sets in at 10–15 dimensions, depending on the size of the database and on the data distribution (clustering, attribute dependencies). During query processing, large parts of a conventional hierarchical index (e.g., the R-tree) must then be read by random access, which is up to 20 times more expensive than sequential reading. Therefore, specialized...
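
The two quantitative claims in the definition can be made concrete with a small sketch. It is an illustration under stated assumptions, not a method taken from the entry: the points are drawn uniformly from the unit hypercube, the Euclidean distance is used, and the cost model is deliberately simplified (every page touched by the index costs one random access, every page of a full scan costs one sequential access). The function names `relative_contrast` and `break_even_fraction` are hypothetical. The first function measures how far the farthest neighbor is from the query relative to the nearest one, a quantity that tends to shrink as the dimensionality grows; the second derives the share of pages an index may touch before a sequential scan becomes cheaper, given the factor of up to 20 quoted above.

```python
import math
import random


def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Assumption for this illustration only: data points and the query
    are drawn uniformly from the unit hypercube [0, 1]^dim.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


def break_even_fraction(random_to_sequential_cost=20.0):
    """Fraction of pages an index may read before a full scan wins.

    Simplified model (an assumption): index cost = f * N * c_random and
    scan cost = N * c_sequential, so the index is cheaper only while
    f < c_sequential / c_random.
    """
    return 1.0 / random_to_sequential_cost


if __name__ == "__main__":
    for dim in (2, 10, 15, 50, 100):
        print(f"dim={dim:3d}  relative contrast={relative_contrast(dim):.2f}")
    print(f"break-even page fraction: {break_even_fraction():.2%}")
```

Under this simplified model, a cost factor of 20 means that a hierarchical index pays off only while a query touches less than 5% of the pages. When the contrast between nearest and farthest neighbor is small, pruning becomes difficult and that threshold is easily exceeded.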

Recommended Reading

  1. Berchtold S, Böhm C, Kriegel H-P. The pyramid-technique: towards breaking the curse of dimensionality. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1998. p. 142–53.
  2. Berchtold S, Böhm C, Jagadish HV, Kriegel H-P, Sander J. Independent quantization: an index compression technique for high-dimensional data spaces. In: Proceedings of the 16th International Conference on Data Engineering; 2000. p. 577–88.
  3. Berchtold S, Böhm C, Keim DA, Kriegel H-P. A cost model for nearest neighbor search in high-dimensional data space. In: Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 1997. p. 78–86.
  4. Berchtold S, Böhm C, Keim DA, Kriegel H-P, Xu X. Optimal multidimensional query processing using tree striping. In: Proceedings of the 2nd International Conference on Data Warehousing and Knowledge Discovery; 2000. p. 244–57.
  5. Berchtold S, Keim DA, Kriegel H-P. The X-tree: an index structure for high-dimensional data. In: Proceedings of the 22nd International Conference on Very Large Data Bases; 1996. p. 28–39.
  6. Beyer KS, Goldstein J, Ramakrishnan R, Shaft U. When is “nearest neighbor” meaningful? In: Proceedings of the 7th International Conference on Database Theory; 1999. p. 217–35.
  7. Böhm C. A cost model for query processing in high dimensional data spaces. ACM Trans Database Syst. 2000;25(2):129–78.
  8. Böhm C, Kriegel H-P. Dynamically optimizing high-dimensional index structures. In: Advances in Database Technology, Proceedings of the 7th International Conference on Extending Database Technology; 2000. p. 36–50.
  9. Böhm C, Berchtold S, Keim DA. Searching in high-dimensional spaces: index structures for improving the performance of multimedia databases. ACM Comput Surv. 2001;33(3):322–73.
  10. Chang Y-C, Bergman LD, Castelli V, Li C-S, Lo M-L, Smith JR. The onion technique: indexing for linear optimization queries. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000. p. 391–402.
  11. Cui B, Ooi BC, Su J, Tan K-L. Indexing high-dimensional data for efficient in-memory similarity search. IEEE Trans Knowl Data Eng. 2005;17(3):339–53.
  12. Ferhatosmanoglu H, Agrawal D, Abbadi AE. Concentric hyperspaces and disk allocation for fast parallel range searching. In: Proceedings of the 15th International Conference on Data Engineering; 1999. p. 608–15.
  13. Günnemann S, Kremer H, Lenhard D, Seidl T. Subspace clustering for indexing high dimensional data: a main memory index based on local reductions and individual multi-representations. In: Proceedings of the International Conference on Extending Database Technology; 2011. p. 237–48.
  14. Guttman A. R-trees: a dynamic index structure for spatial searching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1984. p. 47–57.
  15. Heisterkamp DR, Peng J. Kernel vector approximation files for relevance feedback retrieval in large image databases. Multimed Tools Appl. 2005;26(2):175–89.
  16. Jin H, Ooi BC, Shen HT, Yu C, Zhou A. An adaptive and efficient dimensionality reduction algorithm for high-dimensional indexing. In: Proceedings of the 19th International Conference on Data Engineering; 2003. p. 87–98.
  17. Katayama N, Satoh S. The SR-tree: an index structure for high-dimensional nearest neighbor queries. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1997. p. 369–80.
  18. Kim C, Chhugani J, Satish N, Sedlar E, Nguyen AD, Kaldewey T, Lee VW, Brandt SA, Dubey P. FAST: fast architecture sensitive tree search on modern CPUs and GPUs. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 339–50.
  19. Leis V, Kemper A, Neumann T. The adaptive radix tree: ARTful indexing for main-memory databases. In: Proceedings of the International Conference on Data Engineering; 2013. p. 38–49.
  20. Levandoski JJ, Lomet DB, Sengupta S. The Bw-Tree: a B-tree for new hardware platforms. In: Proceedings of the 29th International Conference on Data Engineering; 2013. p. 302–13.
  21. Lin K-I, Jagadish HV, Faloutsos C. The TV-tree: an index structure for high-dimensional data. VLDB J. 1994;3(4):517–42.
  22. Moise D, Shestakov D, Gudmundsson G, Amsaleg L. Indexing and searching 100M images with map-reduce. In: Proceedings of the 3rd ACM International Conference on Multimedia Retrieval; 2013. p. 17–24.
  23. Sakurai Y, Yoshikawa M, Uemura S, Kojima H. The A-tree: an index structure for high-dimensional spaces using relative approximation. In: Proceedings of the 26th International Conference on Very Large Data Bases; 2000. p. 516–26.
  24. Weber R, Böhm K, Schek H-J. Interactive-time similarity search for large image collections using parallel VA-files. In: Proceedings of the 4th European Conference on Research and Advanced Technology for Digital Libraries; 2000. p. 83–92.
  25. Weber R, Schek H-J, Blott S. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings of the 24th International Conference on Very Large Data Bases; 1998. p. 194–205.
  26. White DA, Jain R. Similarity indexing with the SS-tree. In: Proceedings of the 12th International Conference on Data Engineering; 1996. p. 516–23.
  27. Yu C, Ooi BC, Tan K-L, Jagadish HV. Indexing the distance: an efficient method to KNN processing. In: Proceedings of the 27th International Conference on Very Large Data Bases; 2001. p. 421–30.
  28. Wang J, Wu S, Gao H, Li J, Ooi BC. Indexing multi-dimensional data in a cloud system. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 591–602.

Corresponding author

Correspondence to Christian Böhm.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Böhm, C., Plant, C. (2018). High-Dimensional Indexing. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_804
