Abstract
Among similarity search indexes, the D-index, introduced by Gennaro et al. in 2001, is regarded as an efficient metric access method. Its performance depends on several parameters, and finding their optimal configuration remains an open problem. We study two performance issues that arise when the D-index handles high-dimensional objects. To address them, we introduce an optimization that simplifies the D-index, removing two configuration parameters and improving performance.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Müller-Molina, A.J., Shinohara, T. (2010). On the Configuration of the Similarity Search Data Structure D-Index for High Dimensional Objects. In: Taniar, D., Gervasi, O., Murgante, B., Pardede, E., Apduhan, B.O. (eds) Computational Science and Its Applications – ICCSA 2010. ICCSA 2010. Lecture Notes in Computer Science, vol 6018. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12179-1_37
DOI: https://doi.org/10.1007/978-3-642-12179-1_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12178-4
Online ISBN: 978-3-642-12179-1
eBook Packages: Computer Science (R0)