Finding Patterns with Variable Length Gaps or Don’t Cares

  • Conference paper
Computing and Combinatorics (COCOON 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4112)

Abstract

In this paper we present new algorithms for the pattern matching problem in which the pattern can contain variable length gaps. Given a pattern P with variable length gaps and a text T, our algorithm works in \(O(n + m + \alpha \log(\max_{1 \le i \le l}(b_i - a_i)))\) time, where n is the length of the text, m is the sum of the lengths of the component subpatterns, α is the total number of occurrences of the component subpatterns in the text, and \(a_i\) and \(b_i\) are, respectively, the minimum and maximum number of don’t cares allowed between the ith and (i+1)st components of the pattern. We also present another algorithm which, given a suffix array of the text, can report whether P occurs in T in \(O(m + \alpha \log\log n)\) time. Both algorithms record the information needed to report all occurrences of P in T. Furthermore, the techniques used in our algorithms are shown to be useful in many other contexts.
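To make the problem statement concrete, the following is a minimal Python sketch of matching a pattern with variable length gaps. It is a naive illustration of the problem definition only, not the algorithms of the paper, and it does not achieve the stated time bounds. The function names `find_all` and `match_with_gaps`, the representation of the pattern as a list of non-empty component strings, and the list of `(a, b)` gap bounds are all assumptions introduced for this example.

```python
from bisect import bisect_left


def find_all(text, sub):
    """Return the sorted start indices of all (possibly overlapping)
    occurrences of sub in text."""
    starts = []
    i = text.find(sub)
    while i != -1:
        starts.append(i)
        i = text.find(sub, i + 1)
    return starts


def match_with_gaps(text, components, gaps):
    """Return the sorted exclusive end indices of full matches.

    components: the non-empty subpatterns P_1, ..., P_k (hypothetical layout).
    gaps: gaps[i] = (a, b) means that between a and b don't-care
          characters may separate components[i] and components[i + 1].
    """
    if len(gaps) != len(components) - 1:
        raise ValueError("need exactly one gap between consecutive components")

    # Exclusive end indices of matches of the first component, in sorted order.
    ends = [s + len(components[0]) for s in find_all(text, components[0])]

    for comp, (a, b) in zip(components[1:], gaps):
        next_ends = []
        for s in find_all(text, comp):
            # The occurrence starting at s extends a partial match if some
            # previous end e satisfies a <= s - e <= b, i.e. s - b <= e <= s - a.
            k = bisect_left(ends, s - b)
            if k < len(ends) and ends[k] <= s - a:
                next_ends.append(s + len(comp))
        ends = next_ends  # still sorted, because the starts are sorted
    return ends


if __name__ == "__main__":
    # "ab", then 1 to 3 don't cares, then "cd"
    print(match_with_gaps("abxxcd", ["ab", "cd"], [(1, 3)]))
```

A non-empty result means the pattern occurs in the text. The sketch performs one binary search per component occurrence over the previous component's match ends, which is only meant to show how the gap bounds constrain consecutive components.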

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rahman, M.S., Iliopoulos, C.S., Lee, I., Mohamed, M., Smyth, W.F. (2006). Finding Patterns with Variable Length Gaps or Don’t Cares. In: Chen, D.Z., Lee, D.T. (eds) Computing and Combinatorics. COCOON 2006. Lecture Notes in Computer Science, vol 4112. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11809678_17

  • DOI: https://doi.org/10.1007/11809678_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-36925-7

  • Online ISBN: 978-3-540-36926-4

  • eBook Packages: Computer Science, Computer Science (R0)
