Multimedia Tools and Applications, Volume 21, Issue 1, pp 9–33

D-Index: Distance Searching Index for Metric Data Sets

  • Vlastislav Dohnal
  • Claudio Gennaro
  • Pasquale Savino
  • Pavel Zezula
Article

Abstract

To speed up retrieval in large collections of data, index structures partition the data into subsets so that queries can be evaluated without examining the entire collection. As the complexity of modern data types grows, metric spaces have become a popular paradigm for similarity retrieval. We propose a new index structure, called D-Index, that combines a novel clustering technique with the pivot-based distance searching strategy to speed up the execution of similarity range and nearest neighbor queries on large files of objects stored on disk. We have qualitatively analyzed D-Index and verified its properties on an actual implementation. We have also compared D-Index with other index structures and demonstrated its superiority on several real-life data sets. Contrary to tree organizations, the D-Index structure is suitable for dynamic environments with a high rate of delete/insert operations.

metric spaces, similarity search, index structures, performance evaluation
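
The pivot-based distance searching strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not the D-Index itself, and it omits the clustering technique entirely; it shows only the generic idea of pruning a range query with precomputed object-to-pivot distances and the triangle inequality. The function names `build_pivot_table` and `range_query`, the choice of pivots, and the in-memory table layout are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Metric = Callable[[T, T], float]


def build_pivot_table(
    objects: Sequence[T], pivots: Sequence[T], dist: Metric
) -> List[List[float]]:
    """Precompute d(o, p) for every object o and pivot p (hypothetical helper)."""
    return [[dist(o, p) for p in pivots] for o in objects]


def range_query(
    query: T,
    radius: float,
    objects: Sequence[T],
    pivots: Sequence[T],
    table: Sequence[Sequence[float]],
    dist: Metric,
) -> Tuple[List[T], int]:
    """Return objects within `radius` of `query` and the number of
    object distances actually computed after pivot filtering."""
    q_to_pivots = [dist(query, p) for p in pivots]
    results: List[T] = []
    computed = 0
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o).
        # If the lower bound exceeds the radius for any pivot,
        # the object cannot qualify and its distance is never computed.
        if any(abs(qp - op) > radius for qp, op in zip(q_to_pivots, row)):
            continue
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


if __name__ == "__main__":
    # Toy metric space: numbers under absolute difference.
    def abs_dist(a: float, b: float) -> float:
        return abs(a - b)

    data = [0, 1, 5, 10, 11, 20]
    pivots = [0, 20]
    table = build_pivot_table(data, pivots, abs_dist)
    print(range_query(10, 1.5, data, pivots, table, abs_dist))
```

In this toy setting the filter discards four of the six objects before any query-to-object distance is evaluated. With an expensive metric, that saving is the point of pivot-based filtering.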

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Vlastislav Dohnal (1)
  • Claudio Gennaro (2)
  • Pasquale Savino (2)
  • Pavel Zezula (1)
  1. Masaryk University, Brno, Czech Republic
  2. ISI-CNR, Pisa, Italy
