Minimal Synthesis of String to String Functions from Examples

  • Jad HamzaEmail author
  • Viktor Kunčak
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11388)


We study the problem of synthesizing string to string transformations from a set of input/output examples. The transformations we consider are expressed using a particular class of transducers: functional non-deterministic Mealy machines (f-NDMM). These are machines that read input letters one at a time, and output one letter at each step. The functionality constraint ensures that, even though the machine is locally non-deterministic, each input string is mapped to exactly one output string by the transducer.

We suggest that, given a set of input/output examples, the smallest f-NDMM consistent with the examples is a good candidate for the transformation the user was expecting. We therefore study the problem of, given a set of examples, finding a minimal f-NDMM consistent with the examples and satisfying the functionality and totality constraints mentioned above.

We prove that, in general, the decision problem corresponding to that question is \(\textsf {NP}\)-complete, and we provide several \(\textsf {NP}\)-hardness proofs that show the hardness of multiple variants of the problem.

Finally, we propose an algorithm for finding the minimal f-NDMM consistent with input/output examples, that uses a reduction to SMT solvers. We implemented the algorithm, and used it to evaluate the likelihood that the minimal f-NDMM indeed corresponds to the transformation expected by the user.



We thank Rupak Majumdar for references on hardness of the interpolation problem in the settings of automata.We thank anonymous reviewers of previous submissions for their thorough comments and for relevant references related to learning finite automata and regarding the interpolation problem.


  1. 1.
    Aarts, F., Kuppens, H., Tretmans, J., Vaandrager, F.W., Verwer, S.: Improving active mealy machine learning for protocol conformance testing. Mach. Learn. 96(1–2), 189–224 (2014). Scholar
  2. 2.
    Alur, R., et al.: Syntax-guided synthesis. In: Formal Methods in Computer-Aided Design (FMCAD), pp. 1–17. IEEE (2013)Google Scholar
  3. 3.
    Alur, R., Cerný, P.: Expressiveness of streaming string transducers. In: Lodaya, K., Mahajan, M. (eds.) IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2010. LIPIcs, Chennai, India, 15–18 December 2010, vol. 8, pp. 1–12. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2010)Google Scholar
  4. 4.
    Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75, 87–106 (1987)MathSciNetCrossRefGoogle Scholar
  5. 5.
    Berstel, J.: Transductions and Context-Free Languages. Teubner Studienbücher: Informatik, vol. 38. Teubner, Stuttgart (1979). Scholar
  6. 6.
    Bojańczyk, M.: Transducers with origin information. In: Esparza, J., Fraigniaud, P., Husfeldt, T., Koutsoupias, E. (eds.) ICALP 2014. LNCS, vol. 8573, pp. 26–37. Springer, Heidelberg (2014). Scholar
  7. 7.
    Bollig, B., Habermehl, P., Kern, C., Leucker, M.: Angluin-style learning of NFA. In: Boutilier, C. (ed.) IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, 11–17 July 2009, pp. 1004–1009 (2009).
  8. 8.
    Chen, Y.-F., Farzan, A., Clarke, E.M., Tsay, Y.-K., Wang, B.-Y.: Learning minimal separating DFA’s for compositional verification. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 31–45. Springer, Heidelberg (2009). Scholar
  9. 9.
    Finkbeiner, B., Schewe, S.: Bounded synthesis. STTT 15(5–6), 519–539 (2013). Scholar
  10. 10.
    Gold, E.M.: Complexity of automaton identification from given data. Inf. Control 37(3), 302–320 (1978). Scholar
  11. 11.
    Grinchtein, O., Leucker, M., Piterman, N.: Inferring network invariants automatically. In: Furbach, U., Shankar, N. (eds.) IJCAR 2006. LNCS (LNAI), vol. 4130, pp. 483–497. Springer, Heidelberg (2006). Scholar
  12. 12.
    Gulwani, S.: Automating string processing in spreadsheets using input-output examples. In: Ball, T., Sagiv, M. (eds.) Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, 26–28 January 2011, pp. 317–330. ACM (2011).
  13. 13.
    Hamza, J., Jobstmann, B., Kuncak, V.: Synthesis for regular specifications over unbounded domains. In: Bloem, R., Sharygina, N. (eds.) Proceedings of 10th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2010, Lugano, Switzerland, 20–23 October, pp. 101–109. IEEE (2010).
  14. 14.
    Hamza, J., Kuncak, V.: Minimal synthesis of string to string functions from examples. CoRR abs/1710.09208 (2017).
  15. 15.
    Khalili, A., Tacchella, A.: Learning nondeterministic mealy machines. In: Clark, A., Kanazawa, M., Yoshinaka, R. (eds.) Proceedings of the 12th International Conference on Grammatical Inference, ICGI 2014. JMLR Workshop and Conference Proceedings, Kyoto, Japan, 17–19 September 2014, vol. 34, pp. 109–123. (2014).
  16. 16.
    Mayer, M., Hamza, J., Kuncak, V.: Proactive synthesis of recursive tree-to-string functions from examples. In: Müller, P. (ed.) 31st European Conference on Object-Oriented Programming, ECOOP 2017. LIPIcs, Barcelona, Spain, 19–23 June 2017, vol. 74, pp. 19:1–19:30. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2017).
  17. 17.
    Mealy, G.H.: A method for synthesizing sequential circuits. Bell Labs Tech. J. 34(5), 1045–1079 (1955)MathSciNetCrossRefGoogle Scholar
  18. 18.
    Merten, M.: Active automata learning for real life applications. Ph.D. thesis, Dortmund University of Technology (2013).
  19. 19.
    de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). Scholar
  20. 20.
    Oncina, J., García, P., Vidal, E.: Learning subsequential transducers for pattern recognition interpretation tasks. IEEE Trans. Pattern Anal. Mach. Intell. 15(5), 448–458 (1993). Scholar
  21. 21.
    Pitt, L., Warmuth, M.K.: The minimum consistent DFA problem cannot be approximated within any polynomial. J. ACM 40(1), 95–142 (1993). Scholar
  22. 22.
    Păun, G., Salomaa, A.: Thin and slender languages. Discrete Appl. Math. 61(3), 257–270 (1995). Scholar
  23. 23.
    Sakarovitch, J.: Elements of Automata Theory. Cambridge University Press, Cambridge (2009). Scholar
  24. 24.
    Smetsers, R., Fiterău-Broştean, P., Vaandrager, F.: Model learning as a satisfiability modulo theories problem. In: Klein, S.T., Martín-Vide, C., Shapira, D. (eds.) LATA 2018. LNCS, vol. 10792, pp. 182–194. Springer, Cham (2018). Scholar
  25. 25.
    Stearns, R.E., Hunt III, B.H.: On the equivalence and containment problems for unambiguous regular expressions, regular grammars and finite automata. SIAM J. Comput. 14(3), 598–611 (1985). Scholar
  26. 26.
    Vilar, J.M.: Query learning of subsequential transducers. In: Miclet, L., de la Higuera, C. (eds.) ICGI 1996. LNCS, vol. 1147, pp. 72–83. Springer, Heidelberg (1996). Scholar

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. 1.LARAEPFLLausanneSwitzerland

Personalised recommendations