Clustering-Based Similarity Search in Metric Spaces with Sparse Spatial Centers

  • Conference paper
SOFSEM 2008: Theory and Practice of Computer Science (SOFSEM 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4910)

Abstract

Similarity search in metric spaces is a very active research field that offers efficient methods for indexing and searching large data sets by similarity. In this paper we present SSSTree, a new clustering-based method for similarity search. Its main characteristic is that the cluster centers in each node are selected using Sparse Spatial Selection (SSS), a technique initially developed for the selection of pivots. SSS adapts the set of selected points (pivots or cluster centers) to the intrinsic dimensionality of the space. With SSS, the number of clusters in each node of the tree depends on the complexity of the subspace that node represents, so the space partition in each node follows that complexity, which improves the performance of the search operation. We describe the method and provide experimental results showing that SSSTree performs better than previously proposed indexes.
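
The abstract does not spell out the selection rule, so the following is only a minimal sketch of how SSS-style center selection could look. It assumes the rule usually attributed to SSS: an object becomes a new center when its distance to every center chosen so far is at least `alpha * max_dist`, where `max_dist` is the maximum distance between two objects of the space. The function names, the `alpha` value, and the nearest-center assignment step are hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sss_select_centers(
    objects: Sequence[T],
    dist: Callable[[T, T], float],
    max_dist: float,
    alpha: float = 0.4,  # illustrative value, an assumption of this sketch
) -> List[T]:
    """Pick centers that are pairwise at least alpha * max_dist apart."""
    threshold = alpha * max_dist
    centers: List[T] = []
    for obj in objects:
        # The first object is always selected, since all() over an
        # empty list of centers is True.
        if all(dist(obj, c) >= threshold for c in centers):
            centers.append(obj)
    return centers


def assign_to_clusters(
    objects: Sequence[T],
    centers: Sequence[T],
    dist: Callable[[T, T], float],
) -> List[List[T]]:
    """Assign every object to the cluster of its nearest center."""
    clusters: List[List[T]] = [[] for _ in centers]
    for obj in objects:
        nearest = min(range(len(centers)), key=lambda i: dist(obj, centers[i]))
        clusters[nearest].append(obj)
    return clusters


if __name__ == "__main__":
    points = [(0, 0), (1, 0), (10, 0), (10, 1), (0, 10)]
    # Largest pairwise distance in this toy set: (10, 0) to (0, 10).
    max_d = math.dist((10, 0), (0, 10))
    centers = sss_select_centers(points, math.dist, max_d)
    print(centers)
    print(assign_to_clusters(points, centers, math.dist))
```

In this sketch the number of centers is not fixed in advance: it grows with how spread out the objects are relative to `max_dist`, which is the kind of adaptation to the subspace that the abstract describes. Applying the two functions recursively to each cluster would give a tree, but the stopping criterion and node layout of SSSTree itself are not given in the abstract and are left out here.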

Editor information

Viliam Geffert, Juhani Karhumäki, Alberto Bertoni, Bart Preneel, Pavol Návrat, Mária Bieliková

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brisaboa, N., Pedreira, O., Seco, D., Solar, R., Uribe, R. (2008). Clustering-Based Similarity Search in Metric Spaces with Sparse Spatial Centers. In: Geffert, V., Karhumäki, J., Bertoni, A., Preneel, B., Návrat, P., Bieliková, M. (eds) SOFSEM 2008: Theory and Practice of Computer Science. SOFSEM 2008. Lecture Notes in Computer Science, vol 4910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77566-9_16

  • DOI: https://doi.org/10.1007/978-3-540-77566-9_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77565-2

  • Online ISBN: 978-3-540-77566-9

  • eBook Packages: Computer Science, Computer Science (R0)
