Abstract
We describe the formalization of a certified algorithm for regular expression parsing based on Brzozowski derivatives, in the dependently typed language Idris. The formalized algorithm produces a proof that an input string matches a given regular expression or a proof that no matching exists. A tool for regular expression based search in the style of the well known GNU grep has been developed with the certified algorithm, and practical experiments were conducted with this tool.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
Associativity, Commutativity and Idempotence with Unit elements axioms for REs [7].
- 2.
In Haskell, types are classified using kinds [28] instead of universe levels. The kind of types is denoted by \(\star \) and type operators have functional kinds: \(\kappa \rightarrow \kappa '\), where \(\kappa \) and \(\kappa '\) are kinds. As an example, in Haskell, type Bool has kind \(\star \) and the list type constructor has kind \(\star \rightarrow \star \).
- 3.
Also known as Curry-Howard “isomorphism” [30].
- 4.
References
Almeida, J.B., Moreira, N., Pereira, D., de Sousa, S.M.: Partial derivative automata formalized in Coq. In: Domaratzki, M., Salomaa, K. (eds.) CIAA 2010. LNCS, vol. 6482, pp. 59–68. Springer, Heidelberg (2011)
Antimirov, V.: Partial derivatives of regular expressions and finite automaton constructions. Theor. Comput. Sci. 155(2), 291–319 (1996)
Bernardy, J.-P., Jansson, P.: Certified context-free parsing: a formalisation of valiant’s algorithm in agda. CoRR, abs/1601.07724 (2016)
Bertot, Y., Castran, P.: Interactive Theorem Proving and Program Development: Coq’Art The Calculus of Inductive Constructions, 1st edn. Springer Publishing Company, Incorporated, Heidelberg (2010)
Brady, E.: Idris, a general-purpose dependently typed programming language: design and implementation. J. Funct. Program. 23, 552–593 (2013)
Brady, E.: Programming and reasoning with algebraic effects and dependent types. In: Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming, ICFP 2013, pp. 133–144. ACM, New York (2013)
Brzozowski, J.A.: Derivatives of regular expressions. J. ACM 11(4), 481–494 (1964)
Danielsson, N.A.: Total parser combinators. SIGPLAN Not. 45(9), 285–296 (2010)
Felleisen, M.D., Barski, C., Van Horn, D.: Realm of Racket: Learn to Program, One Game at a Time! Eight Students of Northeastern University, San Francisco (2013)
Firsov, D., Uustalu, T.: Certified parsing of regular languages. In: Gonthier, G., Norrish, M. (eds.) CPP 2013. LNCS, vol. 8307, pp. 98–113. Springer, Heidelberg (2013)
Firsov, D., Uustalu, T.: Certified CYK parsing of context-free languages. J. Log. Algebr. Meth. Program. 83(5–6), 459–468 (2014)
Firsov, D., Uustalu, T.: Certified CYK parsing of context-free languages. J. Log. Algebr. Methods Program. 83(5–6), 459–468 (2014). The 24th Nordic Workshop on Programming Theory (NWPT 2012)
Fischer, S., Huch, F., Wilke, T.: A play on regular expressions: Functional pearl. In: Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming, ICFP 2010, pp. 357–368. ACM, New York (2010)
Frisch, A., Cardelli, L.: Greedy regular expression matching. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142, pp. 618–629. Springer, Heidelberg (2004)
GNU Grep home page. https://www.gnu.org/software/grep/
Hopcroft, J.E., Motwani, R., Rotwani, U., Ullman, J.D.: Introduction to Automata Theory Languages and Computability, 2nd edn. Addison-Wesley Longman Publishing Co., Inc., Boston (2000)
The Idris Tutorial. http://docs.idris-lang.org/en/latest/tutorial/
Jourdan, J.-H., Pottier, F., Leroy, X.: Validating LR(1) parsers. In: Seidl, H. (ed.) Programming Languages and Systems. LNCS, vol. 7211, pp. 397–416. Springer, Heidelberg (2012)
Lesk, M.E., Schmidt, E.: Lex: a lexical analyzer generator. In: Unix, vol. ii. pp. 375–387. W. B. Saunders Company, Philadelphia, PA, USA (1990)
The Lightyear Idris Parsing Combinator Library. https://github.com/ziman/lightyear/
Lopes, R., Ribeiro, R., Camarão, C.: Certified derivative based parsing of regular expressions — on-linerepository (2016). https://github.com/raulfpl/idrisregexp
Macedo, H.D., Oliveira, J.N.: Typing linear algebra: a biproduct-oriented approach. CoRR, abs/1312.4818 (2013)
McBride, C.: Dependently Typed Functional Programs and their Proofs. Ph.D. thesis, Department of Informatics, University of Edinburgh (1999)
McBride, C., McKinna, J.: The view from the left. J. Funct. Program. 14(1), 69–111 (2004)
Might, M., Darais, D., Spiewak, D.: Parsing with derivatives: a functional pearl. SIGPLAN Not. 46(9), 189–195 (2011)
Norell, U.: Dependently typed programming in agda. In: Proceedings of the 4th International Workshop on Types in Language Design and Implementation, TLDI 2009, pp. 1–2. ACM, New York (2009)
Owens, S., Reppy, J., Turon, A.: Regular-expression derivatives re-examined. J. Funct. Program. 19(2), 173–190 (2009)
Pierce, B.C.: Types and Programming Languages, 1st edn. The MIT Press, Cambridge (2002)
Google Regular Expression Library - re2. https://github.com/google/re2
Sørensen, M.H., Urzyczyn, P.: Lectures on the Curry-Howard Isomorphism. Studies in Logic and the Foundations of Mathematics, vol. 149. Elsevier Science Inc., New York (2006)
Sulzmann, M., Lu, K.Z.M.: POSIX regular expression parsing with derivatives. In: Codish, M., Sumii, E. (eds.) FLOPS 2014. LNCS, vol. 8475, pp. 203–220. Springer, Heidelberg (2014)
The Univalent Foundatiosn Program. Homotopy Type Theory: Univalent Foundations of Mathematics (2013). http://homotopytypetheory.org/book/
Valiant, L.G.: General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10(2), 308–315 (1975)
Acknowledgements
The first author thanks Fundação de Amparo a Pesquisa de Minas Gerais (FAPEMIG) for financial support.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Lopes, R., Ribeiro, R., Camarão, C. (2016). Certified Derivative-Based Parsing of Regular Expressions. In: Castor, F., Liu, Y. (eds) Programming Languages. SBLP 2016. Lecture Notes in Computer Science(), vol 9889. Springer, Cham. https://doi.org/10.1007/978-3-319-45279-1_7
Download citation
DOI: https://doi.org/10.1007/978-3-319-45279-1_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45278-4
Online ISBN: 978-3-319-45279-1
eBook Packages: Computer ScienceComputer Science (R0)