Locality-Sensitive Hashing Without False Negatives for \(l_p\)

  • Andrzej Pacuk
  • Piotr Sankowski
  • Karol Wegrzycki
  • Piotr Wygocki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9797)

Abstract

In this paper, we show a construction of locality-sensitive hash functions without false negatives, i.e., functions that ensure a collision for every pair of points within a given radius R in d-dimensional space equipped with the \(l_p\) norm, for \(p \in [1,\infty ]\). Furthermore, we show how to use these hash functions to solve the c-approximate nearest neighbor search problem without false negatives. Namely, if there is a point within distance R, we will certainly report it, and points at distance greater than cR will not be reported, for \(c=\varOmega (\max \{\sqrt{d},d^{1-\frac{1}{p}}\})\). The constructed algorithms work:
  • with preprocessing time \(\mathcal {O}(n \log (n))\) and sublinear expected query time,

  • with preprocessing time \(\mathcal {O}(\mathrm {poly}(n))\) and expected query time \(\mathcal {O}(\log (n))\).

Our paper reports progress on answering the open problem presented by Pagh [8], who considered the nearest neighbor search without false negatives for the Hamming distance.
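
The "no false negatives" guarantee can be illustrated with a minimal sketch. This is not necessarily the exact construction of the paper; it assumes a random ±1 projection vector, a bucket width of \(d^{1-\frac{1}{p}} R\), and a collision rule under which two points collide when their bucket indices differ by at most one. The names `make_hash`, `bucket`, `collide` and `lp_norm` are hypothetical. By Hölder's inequality, \(|\langle z, x-y\rangle | \le \Vert z\Vert _q \Vert x-y\Vert _p = d^{1-\frac{1}{p}} \Vert x-y\Vert _p\) for a ±1 vector z, so points within distance R always collide under this rule.

```python
import math
import random


def lp_norm(v, p):
    """l_p norm of a vector; p may be math.inf."""
    if math.isinf(p):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def make_hash(d, p, R, rng):
    """Draw a random sign vector z and compute the bucket width.

    Assumption for illustration: width = d^(1 - 1/p) * R, which equals
    ||z||_q * R for the conjugate exponent q (1/p + 1/q = 1).
    """
    z = [rng.choice((-1, 1)) for _ in range(d)]
    exponent = 1.0 if math.isinf(p) else 1.0 - 1.0 / p
    width = (d ** exponent) * R
    return z, width


def bucket(x, z, width):
    """Index of the bucket that the projection <z, x> falls into."""
    return math.floor(sum(zi * xi for zi, xi in zip(z, x)) / width)


def collide(x, y, z, width):
    """Assumed collision rule: same or adjacent buckets."""
    return abs(bucket(x, z, width) - bucket(y, z, width)) <= 1
```

Under these assumptions, any pair with \(\Vert x-y\Vert _p \le R\) has projections that differ by at most one bucket width, so `collide` returns `True` for every choice of `z`. Far points may still collide (false positives); bounding how often that happens is where the approximation factor c enters.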

Notes

Acknowledgments

This work was supported by ERC PoC project PAAl-POC 680912 and FET project MULTIPLEX 317532. We would also like to thank Rafał Latała for meaningful discussions.

References

  1. Andoni, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51(1), 117–122 (2008)
  2. Andoni, A., Razenshteyn, I.: Optimal data-dependent hashing for approximate near neighbors. In: Servedio, R.A., Rubinfeld, R. (eds.) Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, 14–17 June 2015, pp. 793–801. ACM (2015)
  3. Bentley, J.L.: K-d trees for semidynamic point sets. In: Proceedings of the Sixth Annual Symposium on Computational Geometry, SCG 1990, pp. 187–197. ACM, New York (1990)
  4. Datar, M., Indyk, P.: Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the Twentieth Annual Symposium on Computational Geometry, SCG 2004, pp. 253–262. ACM Press (2004)
  5. Haagerup, U.: The best constants in the Khintchine inequality. Stud. Math. 70(3), 231–283 (1981)
  6. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)
  7. Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC 1998, pp. 604–613. ACM, New York (1998)
  8. Pagh, R.: Locality-sensitive hashing without false negatives. In: Krauthgamer, R. (ed.) Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, 10–12 January 2016, pp. 1–9. SIAM (2016)
  9. Veraar, M.: On Khintchine inequalities with a weight. Proc. Am. Math. Soc. 138, 4119–4121 (2010)
  10. Williams, R.: A new algorithm for optimal 2-constraint satisfaction and its implications. Theor. Comput. Sci. 348(2), 357–365 (2005)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Andrzej Pacuk
    • 1
  • Piotr Sankowski
    • 1
  • Karol Wegrzycki
    • 1
  • Piotr Wygocki
    • 1
  1. Institute of Informatics, University of Warsaw, Warsaw, Poland
