High-Dimensional Indexing

Böhm, Christian; Plant, Claudia

doi:10.1007/978-1-4614-8265-9_804

Synonyms

Indexing for similarity search

Definition

The term high-dimensional indexing [6, 9] subsumes all techniques for indexing vector spaces that address problems specific to high-dimensional data spaces, together with the optimization techniques that improve index structures and the algorithms for the various forms of similarity search (nearest neighbor queries, reverse nearest neighbor queries, range queries, similarity joins, etc.) in such spaces. The well-known curse of dimensionality degrades index selectivity as the dimensionality of the data space grows. The effect already sets in at 10–15 dimensions, depending on the size of the database and on the data distribution (clustering, attribute dependencies). During query processing, large parts of a conventional hierarchical index (e.g., the R-tree) must then be accessed randomly, which is up to 20 times more expensive than sequential reading. Therefore, specialized...
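
The two effects named in the definition can be illustrated with a minimal Python sketch. It is not taken from the entry, and it rests on simplifying assumptions: points are drawn uniformly from the unit hypercube, distances are Euclidean, and the cost ratio of 20 between a random and a sequential page access is the upper bound quoted above, applied uniformly to every page. The function names `relative_contrast`, `break_even_fraction`, and `index_is_cheaper` are hypothetical. The first function measures how far the farthest point lies beyond the nearest one relative to the nearest distance, a quantity that tends to shrink as the dimensionality grows (cf. [6]); the other two express the simple consequence that an index reading pages randomly pays off only while it touches fewer than 1/20 of the pages a sequential scan would read.

```python
import math
import random


def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Assumption: data and query are uniform in the unit hypercube [0, 1]^dim.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


def break_even_fraction(random_to_sequential_cost=20.0):
    """Largest fraction of pages an index may read randomly before a
    full sequential scan becomes cheaper (assumed uniform page costs)."""
    return 1.0 / random_to_sequential_cost


def index_is_cheaper(fraction_accessed, random_to_sequential_cost=20.0):
    """True if randomly reading `fraction_accessed` of all pages costs less
    than reading every page sequentially."""
    return fraction_accessed * random_to_sequential_cost < 1.0


if __name__ == "__main__":
    for dim in (2, 10, 50, 200):
        print(f"dim={dim:4d}  relative contrast={relative_contrast(dim):.3f}")
    print("break-even fraction of pages:", break_even_fraction())
```

Under these assumptions, an index whose selectivity degrades so far that a query touches, say, 10% of the pages is already slower than a sequential scan, whereas one touching 1% is still worthwhile.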

Recommended Reading

1. Berchtold S, Böhm C, Kriegel H-P. The pyramid-technique: towards breaking the curse of dimensionality. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1998. p. 142–53.
2. Berchtold S, Böhm C, Jagadish HV, Kriegel H-P, Sander J. Independent quantization: an index compression technique for high-dimensional data spaces. In: Proceedings of the 16th International Conference on Data Engineering; 2000. p. 577–88.
3. Berchtold S, Böhm C, Keim DA, Kriegel H-P. A cost model for nearest neighbor search in high-dimensional data space. In: Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 1997. p. 78–86.
4. Berchtold S, Böhm C, Keim DA, Kriegel H-P, Xu X. Optimal multidimensional query processing using tree striping. In: Proceedings of the 2nd International Conference Data Warehousing and Knowledge Discovery; 2000. p. 244–57.
5. Berchtold S, Keim DA, Kriegel H-P. The X-tree: an index structure for high-dimensional data. In: Proceedings of the 22nd International Conference on Very Large Data Bases; 1996. p. 28–39.
6. Beyer KS, Goldstein J, Ramakrishnan R, Shaft U. When is “nearest neighbor” meaningful? In: Proceedings of the 7th International Conference on Database Theory; 1999. p. 217–35.
7. Böhm C. A cost model for query processing in high dimensional data spaces. ACM Trans Database Syst. 2000;25(2):129–78.
8. Böhm C, Kriegel H-P. Dynamically optimizing high-dimensional index structures. In: Advances in Database Technology, Proceedings of the 7th International Conference on Extending Database Technology; 2000. p. 36–50.
9. Böhm C, Berchtold S, Keim DA. Searching in high-dimensional spaces: index structures for improving the performance of multimedia databases. ACM Comput Surv. 2001;33(3):322–73.
10. Chang Y-C, Bergman LD, Castelli V, Li C-S, Lo M-L, Smith JR. The onion technique: indexing for linear optimization queries. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000. p. 391–402.
11. Cui B, Ooi BC, Su J, Tan K-L. Indexing high-dimensional data for efficient in-memory similarity search. IEEE Trans Knowl Data Eng (TKDE). 2005;17(3):339–53.
12. Ferhatosmanoglu H, Agrawal D, Abbadi AE. Concentric hyperspaces and disk allocation for fast parallel range searching. In: Proceedings of the 15th International Conference on Data Engineering; 1999. p. 608–15.
13. Günnemann S, Kremer H, Lenhard D, Seidl T. Subspace clustering for indexing high dimensional data: a main memory index based on local reductions and individual multi-representations. In: Proceedings of the International Conference on Extending Database Technology; 2011. p. 237–48.
14. Guttman A. R-trees: a dynamic index structure for spatial searching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1984. p. 47–57.
15. Heisterkamp DR, Peng J. Kernel vector approximation files for relevance feedback retrieval in large image databases. Multimed Tools Appl. 2005;26(2):175–89.
16. Jin H, Ooi BC, Shen HT, Yu C, Zhou A. An adaptive and efficient dimensionality reduction algorithm for high-dimensional indexing. In: Proceedings of the 19th International Conference on Data Engineering; 2003. p. 87–98.
17. Katayama N, Satoh S. The SR-tree: an index structure for high-dimensional nearest neighbor queries. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1997. p. 369–80.
18. Kim C, Chhugani J, Satish N, Sedlar E, Nguyen AD, Kaldewey T, Lee VW, Brandt SA, Dubey P. FAST: fast architecture sensitive tree search on modern CPUs and GPUs. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 339–50.
19. Leis V, Kemper A, Neumann T. The adaptive radix tree: ARTful indexing for main-memory databases. In: Proceedings of the International Conference on Data Engineering; 2013. p. 38–49.
20. Levandoski JJ, Lomet DB, Sengupta S. The Bw-Tree: a B-tree for new hardware platforms. In: Proceedings of the 29th International Conference on Data Engineering; 2013. p. 302–13.
21. Lin K-I, Jagadish HV, Faloutsos C. The TV-tree: an index structure for high-dimensional data. VLDB J. 1994;3(4):517–42.
22. Moise D, Shestakov D, Gudmundsson G, Amsaleg L. Indexing and searching 100M images with map-reduce. In: Proceedings of the 3rd ACM International Conference on Multimedia Retrieval; 2013. p. 17–24.
23. Sakurai Y, Yoshikawa M, Uemura S, Kojima H. The A-tree: an index structure for high-dimensional spaces using relative approximation. In: Proceedings of the 26th International Conference on Very Large Data Bases; 2000. p. 516–26.
24. Weber R, Böhm K, Schek H-J. Interactive-time similarity search for large image collections using parallel VA-files. In: Proceedings of the 4th European Conference Research and Advanced Technology for Digital Libraries; 2000. p. 83–92.
25. Weber R, Schek H-J, Blott S. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings of the 24th International Conference on Very Large Data Bases; 1998. p. 194–205.
26. White DA, Jain R. Similarity indexing with the SS-tree. In: Proceedings of the 12th International Conference on Data Engineering; 1996. p. 516–23.
27. Yu C, Ooi BC, Tan K-L, Jagadish HV. Indexing the distance: an efficient method to KNN processing. In: Proceedings of the 27th International Conference on Very Large Data Bases; 2001. p. 421–30.
28. Wang J, Wu S, Gao H, Li J, Ooi BC. Indexing multi-dimensional data in a cloud system. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 591–602.

Author information

Authors and Affiliations

University of Munich, Munich, Germany
Christian Böhm
University of Vienna, Vienna, Austria
Claudia Plant

Corresponding author

Correspondence to Christian Böhm.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Section Editor information

Dept. of Computer Science and Eng., Hong Kong Univ. of Science and Technology, Kowloon, Hong Kong, Hong Kong SAR
Dimitris Papadias

Cite this entry

Böhm, C., Plant, C. (2018). High-Dimensional Indexing. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_804

DOI: https://doi.org/10.1007/978-1-4614-8265-9_804
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering