How considering incompatible state mergings may reduce the DFA induction search tree

  • François Coste
  • Jacques Nicolas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1433)


A simple and effective method for DFA induction from positive and negative samples is state merging. The corresponding search space can be tree-structured by considering, for a given pair of states, two subspaces: one where the states are merged and one where they remain distinct. Choosing different pairs leads to search spaces of different sizes, due to dependencies between state mergings; ordering the successive choices of these pairs is therefore an important issue. Starting from a constraint characterization of incompatible state mergings, we show that this characterization allows better choices to be made, i.e. it reduces the size of the search tree. Within this framework, we address the problem of learning the set of all minimal compatible DFAs. We propose a pruning criterion and experiment with several ordering criteria. The prefix order and a new entropy-based criterion exhibited the best results on our test sets.
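To make the merge/don't-merge split concrete, below is a minimal Python sketch of the basic operation the abstract refers to: merging two states of the prefix tree acceptor (PTA) built from the samples, folding successors to keep the automaton deterministic, and detecting an incompatible merging when an accepting and a rejecting state would end up in the same class. This is an illustration, not the authors' implementation; all names (`Pta`, `try_merge`) are assumptions made for this example.

```python
class Pta:
    """Prefix tree acceptor: state 0 is the root (empty prefix)."""
    def __init__(self, positives, negatives):
        self.delta = {}          # (state, symbol) -> state
        self.label = {0: None}   # True = accepting, False = rejecting
        self.n = 1
        for word, lab in [(w, True) for w in positives] + \
                         [(w, False) for w in negatives]:
            s = 0
            for sym in word:
                if (s, sym) not in self.delta:
                    self.delta[(s, sym)] = self.n
                    self.label[self.n] = None
                    self.n += 1
                s = self.delta[(s, sym)]
            self.label[s] = lab

def try_merge(pta, p, q):
    """Merge p and q, plus all mergings forced by determinism.
    Return the resulting partition as a state -> representative map,
    or None if the merging is incompatible with the samples."""
    rep = list(range(pta.n))
    def find(s):
        while rep[s] != s:
            s = rep[s]
        return s
    label = dict(pta.label)
    pending = [(p, q)]
    while pending:
        x, y = pending.pop()
        x, y = find(x), find(y)
        if x == y:
            continue
        lx, ly = label[x], label[y]
        if lx is not None and ly is not None and lx != ly:
            return None          # incompatible: + and - in one class
        rep[y] = x
        label[x] = lx if lx is not None else ly
        # Folding: all classes reached on the same symbol from the
        # merged class must be merged as well (determinization).
        succ = {}
        for (s, sym), t in pta.delta.items():
            if find(s) == x:
                u = find(t)
                if sym in succ and succ[sym] != u:
                    pending.append((succ[sym], u))
                else:
                    succ[sym] = u
    return {s: find(s) for s in range(pta.n)}
```

With samples + = {a, aaa} and - = {aa}, merging the root with the state reached by `a` forces an accepting and a rejecting state together and fails, while merging the two accepting states succeeds. A tree search over such pairs would branch here: one subtree explores the merged partition, the other records the constraint that the pair must remain distinct.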


Keywords: Grammatical inference · DFA · Constraint system · Search tree





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • François Coste (1)
  • Jacques Nicolas (1)
  1. IRISA-INRIA, Rennes Cedex, France
