Synonyms
Indexing for similarity search
Definition
The term high-dimensional indexing [6, 9] subsumes all techniques for indexing vector spaces that address problems specific to high-dimensional data spaces, all optimization techniques that improve index structures, and the algorithms for the variants of similarity search (nearest neighbor queries, reverse nearest neighbor queries, range queries, similarity joins, etc.) in high-dimensional spaces. The well-known curse of dimensionality causes index selectivity to deteriorate as the dimensionality of the data space increases, an effect that already sets in at 10–15 dimensions, depending also on the size of the database and the data distribution (clustering, attribute dependencies). During query processing, large parts of conventional hierarchical indexes (e.g., the R-tree) must be accessed randomly, which is up to 20 times more expensive than sequential reading. Therefore, specialized...
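The two effects described above can be illustrated with a small back-of-the-envelope sketch. It rests on simplifying assumptions that are not part of the entry: data uniformly distributed in the unit hypercube, a hypercube-shaped query, and a cost model that counts only page accesses. The function names are hypothetical, and the default factor of 20 is the upper bound quoted in the text, not a measured value.

```python
def query_edge_length(selectivity: float, dim: int) -> float:
    """Edge length of a hypercube query that covers the given fraction
    of the unit cube [0, 1]^dim (assumes uniformly distributed data)."""
    if not 0.0 < selectivity <= 1.0:
        raise ValueError("selectivity must be in (0, 1]")
    if dim < 1:
        raise ValueError("dim must be at least 1")
    # Volume of a hypercube with edge e is e^dim, so e = selectivity^(1/dim).
    return selectivity ** (1.0 / dim)


def index_beats_scan(fraction_accessed: float,
                     random_cost_factor: float = 20.0) -> bool:
    """True if randomly accessing the given fraction of all pages is
    cheaper than reading every page sequentially (simplified cost model)."""
    if not 0.0 <= fraction_accessed <= 1.0:
        raise ValueError("fraction_accessed must be in [0, 1]")
    # Sequential scan costs 1 unit per page; a random access costs
    # random_cost_factor units per page.
    return fraction_accessed * random_cost_factor < 1.0


if __name__ == "__main__":
    # A query selecting 0.01% of the data needs ever longer edges
    # as the dimensionality grows.
    for dim in (2, 4, 8, 16, 32):
        print(dim, round(query_edge_length(1e-4, dim), 3))
    # With a cost factor of 20, the index only pays off if it touches
    # fewer than 1/20 of the pages.
    for fraction in (0.01, 0.04, 0.10, 0.50):
        print(fraction, index_beats_scan(fraction))
```

Under these assumptions, a query that selects only 0.01% of the data already spans more than half of each axis at 16 dimensions, so it intersects many index pages. Once the index has to touch more than roughly 1/20 of all pages by random access, a plain sequential scan becomes the cheaper option.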
Recommended Reading
Berchtold S, Böhm C, Kriegel H-P. The pyramid-technique: towards breaking the curse of dimensionality. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1998. p. 142–53.
Berchtold S, Böhm C, Jagadish HV, Kriegel H-P, Sander J. Independent quantization: an index compression technique for high-dimensional data spaces. In: Proceedings of the 16th International Conference on Data Engineering; 2000. p. 577–88.
Berchtold S, Böhm C, Keim DA, Kriegel H-P. A cost model for nearest neighbor search in high-dimensional data space. In: Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 1997. p. 78–86.
Berchtold S, Böhm C, Keim DA, Kriegel H-P, Xu X. Optimal multidimensional query processing using tree striping. In: Proceedings of the 2nd International Conference Data Warehousing and Knowledge Discovery; 2000. p. 244–57.
Berchtold S, Keim DA, Kriegel H-P. The X-tree: an index structure for high-dimensional data. In: Proceedings of the 22nd International Conference on Very Large Data Bases; 1996. p. 28–39.
Beyer KS, Goldstein J, Ramakrishnan R, Shaft U. When is “nearest neighbor” meaningful? In: Proceedings of the 7th International Conference on Database Theory; 1999. p. 217–35.
Böhm C. A cost model for query processing in high dimensional data spaces. ACM Trans Database Syst. 2000;25(2):129–78.
Böhm C, Kriegel H-P. Dynamically optimizing high-dimensional index structures. In: Advances in Database Technology, Proceedings of the 7th International Conference on Extending Database Technology; 2000. p. 36–50.
Böhm C, Berchtold S, Keim DA. Searching in high-dimensional spaces: index structures for improving the performance of multimedia databases. ACM Comput Surv. 2001;33(3):322–73.
Chang Y-C, Bergman LD, Castelli V, Li C-S, Lo M-L, Smith JR. The onion technique: indexing for linear optimization queries. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000. p. 391–402.
Cui B, Ooi BC, Su J, Tan K-L. Indexing high-dimensional data for efficient in-memory similarity search. IEEE Trans Knowl Data Eng. 2005;17(3):339–53.
Ferhatosmanoglu H, Agrawal D, Abbadi AE. Concentric hyperspaces and disk allocation for fast parallel range searching. In: Proceedings of the 15th International Conference on Data Engineering; 1999. p. 608–15.
Günnemann S, Kremer H, Lenhard D, Seidl T. Subspace clustering for indexing high dimensional data: a main memory index based on local reductions and individual multi-representations. In: Proceedings of the International Conference on Extending Database Technology; 2011. p. 237–48.
Guttman A. R-trees: a dynamic index structure for spatial searching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1984. p. 47–57.
Heisterkamp DR, Peng J. Kernel vector approximation files for relevance feedback retrieval in large image databases. Multimed Tools Appl. 2005;26(2):175–89.
Jin H, Ooi BC, Shen HT, Yu C, Zhou A. An adaptive and efficient dimensionality reduction algorithm for high-dimensional indexing. In: Proceedings of the 19th International Conference on Data Engineering; 2003. p. 87–98.
Katayama N, Satoh S. The SR-tree: an index structure for high-dimensional nearest neighbor queries. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1997. p. 369–80.
Kim C, Chhugani J, Satish N, Sedlar E, Nguyen AD, Kaldewey T, Lee VW, Brandt SA, Dubey P. FAST: fast architecture sensitive tree search on modern CPUs and GPUs. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 339–50.
Leis V, Kemper A, Neumann T. The adaptive radix tree: ARTful indexing for main-memory databases. In: Proceedings of the International Conference on Data Engineering; 2013. p. 38–49.
Levandoski JJ, Lomet DB, Sengupta S. The Bw-Tree: a B-tree for new hardware platforms. In: Proceedings of the 29th International Conference on Data Engineering; 2013. p. 302–13.
Lin K-I, Jagadish HV, Faloutsos C. The TV-tree: an index structure for high-dimensional data. VLDB J. 1994;3(4):517–42.
Moise D, Shestakov D, Gudmundsson G, Amsaleg L. Indexing and searching 100M images with map-reduce. In: Proceedings of the 3rd ACM International Conference on Multimedia Retrieval; 2013. p. 17–24.
Sakurai Y, Yoshikawa M, Uemura S, Kojima H. The A-tree: an index structure for high-dimensional spaces using relative approximation. In: Proceedings of the 26th International Conference on Very Large Data Bases; 2000. p. 516–26.
Weber R, Böhm K, Schek H-J. Interactive-time similarity search for large image collections using parallel VA-files. In: Proceedings of the 4th European Conference Research and Advanced Technology for Digital Libraries; 2000. p. 83–92.
Weber R, Schek H-J, Blott S. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings of the 24th International Conference on Very Large Data Bases; 1998. p. 194–205.
White DA, Jain R. Similarity indexing with the SS-tree. In: Proceedings of the 12th International Conference on Data Engineering; 1996. p. 516–23.
Yu C, Ooi BC, Tan K-L, Jagadish HV. Indexing the distance: an efficient method to KNN processing. In: Proceedings of the 27th International Conference on Very Large Data Bases; 2001. p. 421–30.
Wang J, Wu S, Gao H, Li J, Ooi BC. Indexing multi-dimensional data in a cloud system. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 591–602.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Böhm, C., Plant, C. (2018). High-Dimensional Indexing. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_804
DOI: https://doi.org/10.1007/978-1-4614-8265-9_804
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering