The Closest Pair Problem under the Hamming Metric

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5609)

Abstract

Finding the closest pair among a given set of points under the Hamming metric is a fundamental problem with many applications. Let n be the number of points and D the dimensionality of each point. We show that for \(0 < D \le n^{0.294}\), the problem over the binary alphabet can be solved in time \(O\left(n^{2+o(1)}\right)\), whereas for \(n^{0.294} < D \le n\), it can be solved in time \(O\left(n^{1.843} D^{0.533}\right)\). We also provide an alternative approach that does not involve algebraic matrix multiplication; it runs in time \(O\left(n^2D/\log^2 D\right)\) with a small constant and is effective in practice. Moreover, for an arbitrarily large alphabet, we obtain an algorithm with time complexity \(O\left(n^2\sqrt{D}\right)\) for \(0 < D \le n^{0.294}\), and \(O\left(n^{1.921} D^{0.767}\right)\) for \(n^{0.294} < D \le n\). In addition, the algorithms proposed in this paper provide a solution to the open problem stated by Kao et al.
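To make the problem statement concrete, the following is a minimal sketch of the brute-force baseline for the binary alphabet. It is not one of the paper's algorithms: it simply compares all pairs, which costs on the order of n^2 D bit operations. The function names (`pack_bits`, `hamming`, `closest_pair_hamming`, `hamming_via_inner_product`) and the sample data are assumptions made for illustration. The last helper shows the standard identity d(u, v) = |u| + |v| - 2⟨u, v⟩ for 0/1 vectors, which is one way all pairwise distances can be reduced to a single matrix product; whether the paper uses exactly this reduction is not stated in the abstract.

```python
from itertools import combinations


def pack_bits(bits):
    """Pack a sequence of 0/1 values into a single Python int."""
    value = 0
    for b in bits:
        value = (value << 1) | (b & 1)
    return value


def hamming(x, y):
    """Hamming distance of two packed binary vectors: popcount of XOR."""
    return bin(x ^ y).count("1")


def closest_pair_hamming(points):
    """Brute-force baseline (illustrative, not the paper's method).

    `points` is a list of equal-length 0/1 sequences.
    Returns (distance, i, j) with i < j for the first pair found
    at minimum Hamming distance.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    packed = [pack_bits(p) for p in points]
    best = None
    for i, j in combinations(range(len(packed)), 2):
        d = hamming(packed[i], packed[j])
        if best is None or d < best[0]:
            best = (d, i, j)
    return best


def hamming_via_inner_product(u, v):
    """For 0/1 vectors: d(u, v) = |u| + |v| - 2 * <u, v>."""
    dot = sum(a * b for a, b in zip(u, v))
    return sum(u) + sum(v) - 2 * dot


# Hypothetical sample input: n = 4 points, D = 6 dimensions.
points = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 1],
]
print(closest_pair_hamming(points))  # (2, 1, 2)
```

Because the inner products ⟨u, v⟩ for all pairs are the entries of A Aᵀ, where A is the n × D matrix whose rows are the points, any speed-up in multiplying an n × D matrix by a D × n matrix carries over to computing all pairwise Hamming distances.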

References

  1. Andoni, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In: 47th Annual Symposium on Foundations of Computer Science, pp. 459–468 (2006)

  2. Nanopoulos, A., Theodoridis, Y., Manolopoulos, Y.: C2P: clustering based on closest pairs. In: 27th International Conference on Very Large Data Bases, pp. 331–340 (2001)

  3. Coppersmith, D.: Rectangular matrix multiplication revisited. Journal of Complexity 13(1), 42–49 (1997)

  4. Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. In: 19th Annual ACM Symposium on Theory of Computing, pp. 1–6 (1987)

  5. Greene, D., Parnas, M., Yao, F.: Multi-index hashing for information retrieval. In: 35th Annual Symposium on Foundations of Computer Science, pp. 722–731 (1994)

  6. Gordon, D.M., Miller, V., Ostapenko, P.: Optimal hash functions for approximate closest pairs on the n-cube (2008), http://arxiv.org/abs/0806.3284

  7. MacWilliams, F.J., Sloane, N.J.A.: The theory of error-correcting codes. North-Holland Mathematical Library (1977)

  8. Basch, J., Khanna, S., Motwani, R.: On diameter verification and boolean matrix multiplication. Technical Report No. STAN-CS-95-1544, Department of Computer Science, Stanford University (1995)

  9. Atkinson, M.D., Santoro, N.: A practical algorithm for Boolean matrix multiplication. Information Processing Letters 29(1), 37–38 (1988)

  10. Kao, M.Y., Sanghi, M., Schweller, R.: Randomized fast design of short DNA words. In: 32nd International Colloquium on Automata, Languages, and Programming, pp. 1275–1286 (2005)

  11. Lipsky, O., Porat, E.: L1 pattern matching lower bound. Information Processing Letters 105(4), 141–143 (2008)

  12. Indyk, P., Lewenstein, M., Lipsky, O., Porat, E.: Closest pair problems in very high dimensions. In: 31st International Colloquium on Automata, Languages and Programming, pp. 782–792 (2004)

  13. Indyk, P., Motwani, R.: Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. In: 30th Annual Symposium on Theory of Computing, pp. 604–613 (1998)

  14. Karp, R.M., Waarts, O., Zweig, G.: The bit vector intersection problem. In: 36th Annual Symposium on Foundations of Computer Science, pp. 621–630 (1995)

  15. Roth, R.M., Seroussi, G.: Bounds for binary codes with narrow distance distributions. IEEE Transactions on Information Theory 53(8), 2760–2768 (2007)

  16. Arlazarov, V.Z., Dinic, E.A., Kronrod, M.A., Faradzev, I.A.: On economical construction of the transitive closure of a directed graph. Dokl. Akad. Nauk 194, 487–488 (1970)

  17. Rytter, W.: Fast recognition of pushdown automaton and context-free languages. Information and Control 67(1-3), 12–22 (1985)

  18. Huang, X., Pan, V.Y.: Fast rectangular matrix multiplications and applications. Journal of Complexity 14(2), 257–299 (1998)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Min, K., Kao, MY., Zhu, H. (2009). The Closest Pair Problem under the Hamming Metric. In: Ngo, H.Q. (eds) Computing and Combinatorics. COCOON 2009. Lecture Notes in Computer Science, vol 5609. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02882-3_21

  • DOI: https://doi.org/10.1007/978-3-642-02882-3_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02881-6

  • Online ISBN: 978-3-642-02882-3

  • eBook Packages: Computer Science, Computer Science (R0)
