Skip to main content

Derivative-Based Diagnosis of Regular Expression Ambiguity

  • Conference paper
  • First Online:
Implementation and Application of Automata (CIAA 2016)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9705))

Included in the following conference series:

Abstract

Regular expressions are often ambiguous. We present a novel method based on Brzozowski’s derivatives to aid the user in diagnosing ambiguous regular expressions. We introduce a derivative-based finite state transducer to generate parse trees and minimal counter-examples. The transducer can be easily customized to either follow the POSIX or Greedy disambiguation policy and based on a finite set of examples it is possible to examine if there are any differences among POSIX and Greedy.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    http://arxiv.org/abs/1604.06644.

  2. 2.

    Additional arguments are x and r but we use notation \({ i njs}_{d_{x}(r)} \ \) to highlight that the definition is defined by pattern match over the various cases of the derivative operation.

  3. 3.

    Technically, we treat the set of parse trees like a list. Recall that \({ a llEps}_{\cdot }\) and the simplification rule (Idemp) favor the left-most match. Alternatives keep their relative position in an expression.

References

  1. Berry, G., Sethi, R.: From regular expressions to deterministic automata. Theoret. Comput. Sci. 48(1), 117–126 (1986). http://dl.acm.org/citation.cfm?id=39528.39537

    Article  MathSciNet  MATH  Google Scholar 

  2. Book, R., Even, S., Greibach, S., Ott, G.: Ambiguity in graphs and expressions. IEEE Trans. Comput. 20(2), 149–153 (1971). http://dx.doi.org/10.1109/T-C.1971.223204

    Article  MathSciNet  MATH  Google Scholar 

  3. Borsotti, A., Breveglieri, L., Crespi Reghizzi, S., Morzenti, A.: From ambiguous regular expressions to deterministic parsing automata. In: Drewes, F. (ed.) CIAA 2015. LNCS, vol. 9223, pp. 35–48. Springer, Heidelberg (2015)

    Chapter  Google Scholar 

  4. Brabrand, C., Thomsen, J.G.: Typed and unambiguous pattern matching on strings using regular expressions. In: Proceedings of PPDP 2010, pp. 243–254. ACM (2010)

    Google Scholar 

  5. Brzozowski, J.A.: Derivatives of regular expressions. J. ACM 11(4), 481–494 (1964)

    Article  MathSciNet  MATH  Google Scholar 

  6. Frisch, A., Cardelli, L.: Greedy regular expression matching. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142, pp. 618–629. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  7. Institute of Electrical, Electronics Engineers (IEEE): Standard for information technology - Portable Operating System Interface (POSIX) - Part 2 (Shell and utilities), Section 2.8 (Regular expression notation), IEEE Standard 1003.2, New York (1992)

    Google Scholar 

  8. McNaughton, R., Yamada, H.: Regular expressions and finite state graphs for automata. IRE Trans. Electron. Comput. EC 9(1), 38–47 (1960)

    MATH  Google Scholar 

  9. Okui, S., Suzuki, T.: Disambiguation in regular expression matching via position automata with augmented transitions. In: Domaratzki, M., Salomaa, K. (eds.) CIAA 2010. LNCS, vol. 6482, pp. 231–240. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  10. PCRE - Perl Compatible Regular Expressions. http://www.pcre.org/

  11. Sulzmann, M., Lu, K.Z.M.: Regular expression sub-matching using partial derivatives. In: Proceedings of PPDP 2012, pp. 79–90. ACM (2012)

    Google Scholar 

  12. Sulzmann, M., Lu, K.Z.M.: POSIX regular expression parsing with derivatives. In: Codish, M., Sumii, E. (eds.) FLOPS 2014. LNCS, vol. 8475, pp. 203–220. Springer, Heidelberg (2014)

    Chapter  Google Scholar 

  13. Watson, B.W.: A taxonomy of finite automata minimization algorithmes. Computing Science Note 93/44. Eindhoven University of Technology, The Netherlands (1993)

    Google Scholar 

Download references

Acknowledgments

We thank Peter Thiemann and the reviewers for their comments.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Martin Sulzmann .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Sulzmann, M., Lu, K.Z.M. (2016). Derivative-Based Diagnosis of Regular Expression Ambiguity. In: Han, YS., Salomaa, K. (eds) Implementation and Application of Automata. CIAA 2016. Lecture Notes in Computer Science(), vol 9705. Springer, Cham. https://doi.org/10.1007/978-3-319-40946-7_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-40946-7_22

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40945-0

  • Online ISBN: 978-3-319-40946-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics