Similarity Sketching

  • Reference work entry
  • Encyclopedia of Big Data Technologies

Acknowledgements

This work received support from the European Research Council under the European Union’s 7th Framework Programme (FP7/2007-2013) / ERC grant agreement no. 614331.

Author information

Corresponding author

Correspondence to Rasmus Pagh.

Copyright information

© 2019 Springer Nature Switzerland AG

About this entry

Cite this entry

Pagh, R. (2019). Similarity Sketching. In: Sakr, S., Zomaya, A.Y. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-77525-8_58
