The Abstraction and Instantiation of String-Matching Programs

The Essence of Computation

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2566)

Abstract

We consider a naive, quadratic string matcher testing whether a pattern occurs in a text; we equip it with a cache mediating its access to the text; and we abstract the traversal policy of the pattern, the cache, and the text. We then specialize this abstracted program with respect to a pattern, using the off-the-shelf partial evaluator Similix.

Instantiating the abstracted program with a left-to-right traversal policy yields the linear-time behavior of Knuth, Morris and Pratt’s string matcher. Instantiating it with a right-to-left policy yields the linear-time behavior of Boyer and Moore’s string matcher.
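
As a rough illustration of the starting point described above, here is a minimal sketch of a naive matcher whose pattern traversal order is a parameter. It is not the authors' program: the names `Policy`, `left_to_right`, `right_to_left`, and `naive_match` are hypothetical, the sketch returns the index of the first occurrence rather than a yes/no answer, and it omits both the cache and the specialization step, so it stays quadratic in the worst case whichever policy is chosen.

```python
from typing import Callable, Iterable

# A traversal policy maps the pattern length to the order in which
# pattern positions are compared (hypothetical name, not from the chapter).
Policy = Callable[[int], Iterable[int]]


def left_to_right(m: int) -> Iterable[int]:
    """Compare pattern positions 0, 1, ..., m-1."""
    return range(m)


def right_to_left(m: int) -> Iterable[int]:
    """Compare pattern positions m-1, ..., 1, 0."""
    return range(m - 1, -1, -1)


def naive_match(pattern: str, text: str, policy: Policy) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Every candidate alignment is checked from scratch, so nothing learned
    from earlier comparisons is reused (no cache, no specialization).
    """
    m, n = len(pattern), len(text)
    for start in range(n - m + 1):
        # all() stops at the first mismatching position in policy order.
        if all(pattern[j] == text[start + j] for j in policy(m)):
            return start
    return -1
```

Both policies give the same answer on any input; they differ only in the order of comparisons within each alignment, which is the degree of freedom the abstract refers to.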

Corresponding authors: Torben Amtoft (tamtoft@cis.ksu.edu) and Olivier Danvy (danvy@brics.dk). Extended version available as the BRICS technical report RS-01-12.

Basic Research in Computer Science (www.brics.dk), funded by the Danish National Research Foundation.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Amtoft, T., Consel, C., Danvy, O., Malmkjær, K. (2002). The Abstraction and Instantiation of String-Matching Programs. In: Mogensen, T.Æ., Schmidt, D.A., Sudborough, I.H. (eds) The Essence of Computation. Lecture Notes in Computer Science, vol 2566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36377-7_15

  • DOI: https://doi.org/10.1007/3-540-36377-7_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00326-7

  • Online ISBN: 978-3-540-36377-4

  • eBook Packages: Springer Book Archive
