Learning stochastic finite automata from experts

  • Colin de la Higuera
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1433)


We present in this paper a new learning problem called learning distributions from experts. In the case we study, the experts are stochastic deterministic finite automata (sdfa). We deal with the situation that arises when one wants to learn sdfa from unrepeated examples. This is intended to model the situation where the data is not generated automatically, but is presented in an order that depends on its probability, as would be the case with data presented by a human expert. It is then impossible to use frequency measures directly to construct the underlying automaton or to adjust its probabilities. We prove that, although polynomial identification with probability one is not always possible, a wide class of automata can be successfully identified under this criterion. As the framework is new, it leads to a variety of open problems.
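
To make the setting concrete, the following is a minimal sketch of an sdfa and of an expert that presents each string at most once. It is not the paper's construction: the class `SDFA`, the toy automaton `toy`, and the function `unrepeated_presentation` are hypothetical names introduced here, and the ordering rule (non-increasing probability, ties broken by length and then alphabetically) is an assumption, since the abstract only says that the order depends on probability.

```python
from fractions import Fraction
from itertools import product


class SDFA:
    """Toy stochastic deterministic finite automaton (illustrative only)."""

    def __init__(self, initial, transitions, final):
        # transitions: {(state, symbol): (next_state, probability)}
        # final: {state: probability of stopping in that state}
        self.initial = initial
        self.transitions = transitions
        self.final = final

    def probability(self, word):
        """Probability that the automaton generates exactly `word`."""
        state, p = self.initial, Fraction(1)
        for symbol in word:
            if (state, symbol) not in self.transitions:
                return Fraction(0)
            state, q = self.transitions[(state, symbol)]
            p *= q
        return p * self.final.get(state, Fraction(0))


def unrepeated_presentation(automaton, alphabet, max_len):
    """Assumed expert: list each string of positive probability once,
    most probable first (ties broken by length, then alphabetically)."""
    words = ["".join(t)
             for n in range(max_len + 1)
             for t in product(alphabet, repeat=n)]
    support = [w for w in words if automaton.probability(w) > 0]
    return sorted(support,
                  key=lambda w: (-automaton.probability(w), len(w), w))


# Hypothetical two-state automaton over {a, b}; in each state the
# outgoing transition probabilities and the stopping probability sum to one.
toy = SDFA(
    initial="q0",
    transitions={
        ("q0", "a"): ("q0", Fraction(1, 2)),
        ("q0", "b"): ("q1", Fraction(1, 4)),
    },
    final={"q0": Fraction(1, 4), "q1": Fraction(1)},
)

print(unrepeated_presentation(toy, "ab", 2))
```

Because every string appears exactly once in such a presentation, counting occurrences says nothing about the transition probabilities; only the order in which strings appear carries information about the distribution.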


identification with probability one; grammatical inference; polynomial learning; stochastic deterministic finite automata





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Colin de la Higuera (EURISE, Université de Saint-Etienne, France)
