Bounded Seas: Island Parsing Without Shipwrecks
  • Jan Kurš
  • Mircea Lungu
  • Oscar Nierstrasz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8706)

Abstract

Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language's syntax (the islands), while the rest of the syntax (the water) is defined imprecisely.

Usually, water is defined as the negation of islands. Although simple, this definition is naive and impedes the composition of islands. When developing an island grammar, a programmer sooner or later has to create water tailored to each individual island. This approach is fragile, because the water may have to change whenever the grammar changes. It is time-consuming, because the water is defined manually by a programmer rather than derived automatically. Finally, an island surrounded by such water cannot be reused, because the water has to be defined for each grammar individually.
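
As an illustration of why water defined as the negation of an island composes poorly, consider the minimal sketch below. It is not taken from the paper: the tiny combinator helpers (`regex`, `seq`, `optional`, `naive_sea`) and the toy `class`/`method` input are hypothetical names chosen only for this example. The naive water skips anything that is not a method, so it drifts past the end of class A and picks up a method that belongs to class B.

```python
import re
from typing import Callable, Optional, Tuple

# A parser maps (text, position) to (value, new_position), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[object, int]]]

def regex(pattern: str) -> Parser:
    compiled = re.compile(pattern)
    def parse(text: str, pos: int):
        match = compiled.match(text, pos)
        return (match.group(0), match.end()) if match else None
    return parse

def naive_sea(island: Parser) -> Parser:
    # Water is "anything that is not the island": skip one character at a
    # time until the island matches, no matter what is skipped on the way.
    def parse(text: str, pos: int):
        while pos <= len(text):
            result = island(text, pos)
            if result is not None:
                return result
            pos += 1
        return None
    return parse

def optional(parser: Parser) -> Parser:
    # Succeed with None (consuming nothing) when the wrapped parser fails.
    def parse(text: str, pos: int):
        result = parser(text, pos)
        return result if result is not None else (None, pos)
    return parse

def seq(first: Parser, second: Parser) -> Parser:
    # Run two parsers one after the other and pair their values.
    def parse(text: str, pos: int):
        r1 = first(text, pos)
        if r1 is None:
            return None
        v1, pos = r1
        r2 = second(text, pos)
        if r2 is None:
            return None
        v2, pos = r2
        return ((v1, v2), pos)
    return parse

# Hypothetical toy grammar: a class header followed by an optional method island.
method = regex(r"method \w+")
class_header = regex(r"class \w+ \{")
class_decl = seq(class_header, optional(naive_sea(method)))

text = "class A { } class B { method m }"
value, _ = class_decl(text, 0)
print(value)  # ('class A {', 'method m'): the method of B is attributed to A
```

Fixing this by hand would mean writing water for `method` that knows about the closing brace of a class, which is exactly the island-specific, grammar-dependent water described above.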

In this paper we propose a new technique for island parsing: bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. We integrated bounded seas into a parser combinator framework to demonstrate their composability and reusability.
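
The following sketch illustrates one way to read the idea of water that is bounded rather than unlimited. It is not the authors' implementation, and it rests on an assumption the abstract does not spell out: that the water should stop at whatever may legitimately follow the sea. Here that boundary is passed in by hand as the hypothetical `boundary` parameter, whereas the abstract states that island-specific water is created automatically. All helper names (`regex`, `seq`, `optional`, `bounded_sea`) and the toy input are invented for this example.

```python
import re
from typing import Callable, Optional, Tuple

# A parser maps (text, position) to (value, new_position), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[object, int]]]

def regex(pattern: str) -> Parser:
    compiled = re.compile(pattern)
    def parse(text: str, pos: int):
        match = compiled.match(text, pos)
        return (match.group(0), match.end()) if match else None
    return parse

def bounded_sea(island: Parser, boundary: Parser) -> Parser:
    # Skip water until the island matches, but give up as soon as the
    # boundary is seen first. ASSUMPTION: the boundary is supplied by the
    # caller here; it is not derived from the grammar in this sketch.
    def parse(text: str, pos: int):
        while pos <= len(text):
            result = island(text, pos)
            if result is not None:
                return result
            if boundary(text, pos) is not None:
                return None  # reached the boundary: this sea has no island
            pos += 1
        return None
    return parse

def optional(parser: Parser) -> Parser:
    # Succeed with None (consuming nothing) when the wrapped parser fails.
    def parse(text: str, pos: int):
        result = parser(text, pos)
        return result if result is not None else (None, pos)
    return parse

def seq(first: Parser, second: Parser) -> Parser:
    # Run two parsers one after the other and pair their values.
    def parse(text: str, pos: int):
        r1 = first(text, pos)
        if r1 is None:
            return None
        v1, pos = r1
        r2 = second(text, pos)
        if r2 is None:
            return None
        v2, pos = r2
        return ((v1, v2), pos)
    return parse

# Hypothetical toy grammar: a method island inside a class body that ends at "}".
method = regex(r"method \w+")
class_header = regex(r"class \w+ \{")
close_brace = regex(r"\}")
class_decl = seq(class_header, optional(bounded_sea(method, close_brace)))

text = "class A { } class B { method m }"
first_class, _ = class_decl(text, 0)
second_class, _ = class_decl(text, text.index("class B"))
print(first_class)   # ('class A {', None)
print(second_class)  # ('class B {', 'method m')
```

Because the search for `method` stops at the closing brace, the same island can be placed inside the class rule without leaking into the next class, which is the kind of composability the abstract claims for bounded seas.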

Keywords

Character Class; Context Free Grammar; Input String; Composability Problem; Extended Semantic
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jan Kurš (1)
  • Mircea Lungu (1)
  • Oscar Nierstrasz (1)
  1. Software Composition Group, University of Bern, Switzerland
