Introduction

Operators for Similarity Search

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

In this introductory chapter, we consider how common similarity search systems operate, taking a semantics-oriented view rather than the efficiency-oriented view typical of the database literature. We illustrate that the full specification of a similarity search system involves the schema definition as well as the details of two phases: pairwise similarity estimation and result set identification. We will see how variations in the specification of these two phases give rise to various similarity operators. In addition to reviewing the most common similarity operator, the top-k operator, we survey the landscape of similarity operators proposed over the last two decades. We then consider the notion of similarity from a cognitive/psychological perspective and outline some assumptions about similarity measures that form conventional wisdom in that literature. In particular, we focus on the aspects that have implications for building computer-based similarity search systems, and outline some disconnects between the psychology and computing literatures in the assumptions they make about similarity measures.
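
To make the two phases named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the chapter's own formulation: the similarity measure (a Euclidean distance mapped into (0, 1]), the function names, and the threshold-based variant are all hypothetical choices made for this example. It shows how keeping the pairwise similarity estimation fixed and changing only the result set identification yields a different operator.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Scored = Tuple[int, float]  # (index of object in the dataset, similarity score)


def euclidean_similarity(a: Vector, b: Vector) -> float:
    """Phase 1, pairwise similarity estimation (assumed measure).

    Maps Euclidean distance into (0, 1]; identical objects score 1.0.
    """
    return 1.0 / (1.0 + math.dist(a, b))


def top_k(
    query: Vector,
    objects: Sequence[Vector],
    k: int,
    sim: Callable[[Vector, Vector], float] = euclidean_similarity,
) -> List[Scored]:
    """Phase 2, result set identification: keep the k highest-scoring objects."""
    scored = ((i, sim(query, obj)) for i, obj in enumerate(objects))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


def threshold_query(
    query: Vector,
    objects: Sequence[Vector],
    threshold: float,
    sim: Callable[[Vector, Vector], float] = euclidean_similarity,
) -> List[Scored]:
    """Alternative phase 2 (hypothetical variant): keep every object whose
    similarity to the query is at least `threshold`."""
    scored = ((i, sim(query, obj)) for i, obj in enumerate(objects))
    return [(i, s) for i, s in scored if s >= threshold]


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
    q = (0.0, 0.0)
    print(top_k(q, data, k=2))
    print(threshold_query(q, data, threshold=0.4))
```

Passing a different `sim` function changes the first phase without touching either result set identification routine, which is the kind of separation the abstract describes.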

Author information

Corresponding author

Correspondence to Deepak P.

Copyright information

© 2015 The Author(s)

About this chapter

Cite this chapter

P, D., Deshpande, P.M. (2015). Introduction. In: Operators for Similarity Search. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-21257-9_1

  • DOI: https://doi.org/10.1007/978-3-319-21257-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21256-2

  • Online ISBN: 978-3-319-21257-9

  • eBook Packages: Computer Science (R0)
