Introduction

P, Deepak; Deshpande, Prasad M.

doi:10.1007/978-3-319-21257-9_1

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

In this introductory chapter, we consider the operation of common similarity search systems from a semantic point of view, rather than the efficiency-oriented view typical of the database literature. We illustrate that the full specification of a similarity search system involves the schema definition as well as details of two phases: pairwise similarity estimation and result set identification. We will see how variations in the specification of these two phases give rise to various similarity operators. In addition to reviewing the most common similarity operator, the top-k operator, we survey the landscape of similarity operators proposed over the last two decades. We then consider the notion of similarity from a cognitive and psychological perspective and outline some assumptions about similarity measures that form conventional wisdom in that literature. In particular, we focus on aspects that have implications for building computer-based similarity search systems, and we outline some disconnects between the psychology and computing literatures in the assumptions they make about similarity measures.
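
As a rough illustration of the two phases named in the abstract, the following Python sketch separates pairwise similarity estimation from result set identification. It is not taken from the chapter: the function names, the use of negated Euclidean distance as the similarity measure, and the threshold-based variant are all assumptions made for this example.

```python
import heapq
import math
from typing import Callable, List, Sequence

Point = Sequence[float]


def pairwise_similarity(query: Point, obj: Point) -> float:
    """Phase 1 (assumed measure): negated Euclidean distance, so larger means more similar."""
    return -math.dist(query, obj)


def top_k(query: Point, dataset: List[Point], k: int,
          sim: Callable[[Point, Point], float] = pairwise_similarity) -> List[int]:
    """Phase 2, top-k variant: indices of the k most similar objects, best first."""
    scored = [(sim(query, obj), i) for i, obj in enumerate(dataset)]
    return [i for _, i in heapq.nlargest(k, scored)]


def range_query(query: Point, dataset: List[Point], threshold: float,
                sim: Callable[[Point, Point], float] = pairwise_similarity) -> List[int]:
    """Phase 2, threshold variant: indices of all objects with similarity >= threshold."""
    return [i for i, obj in enumerate(dataset) if sim(query, obj) >= threshold]


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
    print(top_k((1.0, 1.0), data, k=2))
    print(range_query((1.0, 1.0), data, threshold=-0.5))
```

Both functions share the same similarity estimation step and differ only in how the result set is chosen, which is one way to read the abstract's remark that varying these specifications yields different operators.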

Author information

Authors and Affiliations

IBM Research, Bangalore, India
Deepak P & Prasad M. Deshpande

Corresponding author

Correspondence to Deepak P.

About this chapter

Cite this chapter

P, D., Deshpande, P.M. (2015). Introduction. In: Operators for Similarity Search. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-21257-9_1

DOI: https://doi.org/10.1007/978-3-319-21257-9_1
Published: 08 July 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21256-2
Online ISBN: 978-3-319-21257-9
eBook Packages: Computer Science, Computer Science (R0)
