Improved Sketching of Hamming Distance with Error Correcting

  • Conference paper
Combinatorial Pattern Matching (CPM 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4580)

Abstract

We address the problem of sketching the Hamming distance of data streams. We develop Fixable Sketches, which compare data streams or files and restore the differences between them. Our contribution: for two streams with Hamming distance bounded by k, we show a sketch of size O(k log n) with O(log n) processing time per new element in the stream, and we show how to restore all locations where the two streams differ in time linear in the sketch size. The probability of error is less than 1/n.
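To make the idea of a sketch that both compares two streams and restores their differences concrete, here is a minimal illustration for the simplest case, k = 1. It is not the construction from the paper: it is a textbook-style toy that assumes non-negative integer stream elements smaller than the modulus `P` and at most one differing position. The names `OneMismatchSketch`, `sketch_of`, and `locate_single_difference` are hypothetical and chosen only for this example.

```python
# Toy "fixable" sketch for streams that differ in at most ONE position.
# Assumption: elements are non-negative integers smaller than P.
# This is an illustration only, not the O(k log n) sketch of the paper.

P = (1 << 61) - 1  # a Mersenne prime, used as the modulus


class OneMismatchSketch:
    """Keeps two counters, updated in O(1) per stream element."""

    def __init__(self):
        self.count = 0  # number of elements seen so far
        self.s0 = 0     # sum of values, mod P
        self.s1 = 0     # sum of (position * value), mod P, positions from 1

    def update(self, value):
        self.count += 1
        self.s0 = (self.s0 + value) % P
        self.s1 = (self.s1 + self.count * value) % P


def sketch_of(stream):
    """Feed an iterable of integers through a fresh sketch."""
    sk = OneMismatchSketch()
    for value in stream:
        sk.update(value)
    return sk


def locate_single_difference(a, b):
    """Return the 1-based position where the two streams differ,
    or None if the sketches are identical.

    Only meaningful when the streams have equal length and differ in
    at most one position; with more mismatches the answer may be wrong.
    """
    d0 = (a.s0 - b.s0) % P  # equals delta = a_i - b_i (mod P)
    d1 = (a.s1 - b.s1) % P  # equals i * delta (mod P)
    if d0 == 0 and d1 == 0:
        return None
    if d0 == 0:
        raise ValueError("sketches are inconsistent with a single mismatch")
    # Divide i * delta by delta modulo P (modular inverse, Python 3.8+).
    return d1 * pow(d0, -1, P) % P


if __name__ == "__main__":
    a = sketch_of([3, 1, 4, 1, 5, 9, 2, 6])
    b = sketch_of([3, 1, 4, 1, 7, 9, 2, 6])
    print(locate_single_difference(a, b))
```

The two counters play the role of the sketch: each party keeps only them, and their difference is enough to recover the mismatch position. Handling up to k mismatches with error probability below 1/n requires the more elaborate construction described in the paper.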

Editor information

Bin Ma, Kaizhong Zhang

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Porat, E., Lipsky, O. (2007). Improved Sketching of Hamming Distance with Error Correcting. In: Ma, B., Zhang, K. (eds) Combinatorial Pattern Matching. CPM 2007. Lecture Notes in Computer Science, vol 4580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73437-6_19

  • DOI: https://doi.org/10.1007/978-3-540-73437-6_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73436-9

  • Online ISBN: 978-3-540-73437-6

  • eBook Packages: Computer Science (R0)
