Learning Stochastic Deterministic Regular Languages

  • Conference paper
Grammatical Inference: Algorithms and Applications (ICGI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3264)

Abstract

In this article we propose a new practical algorithm for inferring μ-distinguishable stochastic deterministic regular languages. We prove that, given a polynomial number of examples, the algorithm infers with high probability an automaton isomorphic to the target. We also discuss how the error function used to evaluate the inferred model relates to the learnability of the model class in a PAC-like framework.
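
The abstract does not define μ-distinguishability. In the related literature on learning probabilistic deterministic automata, it is usually taken to mean that the suffix distributions of any two distinct states differ by at least μ on some string (an L∞ distance). Assuming that reading, the sketch below estimates this distance from two samples of suffixes. The function names and the `threshold` parameter are hypothetical and are not taken from the paper; this illustrates the notion only, not the authors' algorithm.

```python
from collections import Counter


def empirical_distribution(sample):
    """Map each string in the sample to its relative frequency."""
    counts = Counter(sample)
    total = len(sample)
    return {s: c / total for s, c in counts.items()}


def linf_distance(sample_a, sample_b):
    """Largest absolute gap in empirical probability over all observed strings."""
    p = empirical_distribution(sample_a)
    q = empirical_distribution(sample_b)
    support = p.keys() | q.keys()
    return max((abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support), default=0.0)


def looks_distinct(sample_a, sample_b, threshold):
    """Hypothetical test: treat two states as different when the estimated
    distance between their suffix samples reaches the given threshold."""
    return linf_distance(sample_a, sample_b) >= threshold


# Suffixes observed after reaching two candidate states (made-up data).
suffixes_a = ["a", "a", "b", "ab"]
suffixes_b = ["a", "b", "b", "b"]
print(linf_distance(suffixes_a, suffixes_b))
print(looks_distinct(suffixes_a, suffixes_b, threshold=0.3))
```

With finite samples the estimate is noisy, so a practical threshold would have to depend on both μ and the sample sizes. The sketch leaves that choice to the caller.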

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thollard, F., Clark, A. (2004). Learning Stochastic Deterministic Regular Languages. In: Paliouras, G., Sakakibara, Y. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2004. Lecture Notes in Computer Science, vol 3264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30195-0_22

  • DOI: https://doi.org/10.1007/978-3-540-30195-0_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23410-4

  • Online ISBN: 978-3-540-30195-0

  • eBook Packages: Springer Book Archive