Towards General Algorithms for Grammatical Inference

  • Alexander Clark
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6331)


Many algorithms for grammatical inference can be viewed as instances of a more general algorithm which maintains a set of primitive elements, which distributionally define sets of strings, and a set of features or tests that constrain various inference rules. Using this general framework, which we cast as a process of logical inference, we re-analyse Angluin’s famous L* algorithm and several recent algorithms for the inference of context-free grammars and multiple context-free grammars. Finally, to illustrate the advantages of this approach, we extend it to the inference of functional transductions from positive data only, and we present a new algorithm for the inference of finite state transducers.
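
As a rough illustration of the distributional idea in the abstract, the sketch below groups prefixes (standing in for primitive elements) by their behaviour under a set of suffix tests, in the style of an observation table. It is not the paper's algorithm: the function names `row` and `group_by_row`, the membership oracle `even_as`, and the example language are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple


def row(prefix: str, tests: List[str], member: Callable[[str], bool]) -> Tuple[bool, ...]:
    """Record which tests (suffixes) the prefix passes under the membership oracle."""
    return tuple(member(prefix + t) for t in tests)


def group_by_row(
    primitives: List[str], tests: List[str], member: Callable[[str], bool]
) -> Dict[Tuple[bool, ...], List[str]]:
    """Group primitives whose rows agree; each group is a candidate state."""
    classes: Dict[Tuple[bool, ...], List[str]] = {}
    for p in primitives:
        classes.setdefault(row(p, tests, member), []).append(p)
    return classes


# Hypothetical target language: strings with an even number of 'a's.
def even_as(w: str) -> bool:
    return w.count("a") % 2 == 0


primitives = ["", "a", "b", "aa", "ab"]
tests = [""]  # the empty suffix alone separates the two classes here
classes = group_by_row(primitives, tests, even_as)
for signature, members in classes.items():
    print(signature, members)
```

With a richer set of tests, fewer primitives are merged; with too few tests, distinct behaviours can be conflated, which is the sense in which the tests constrain what may be inferred.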





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Alexander Clark
    Department of Computer Science, Royal Holloway, University of London
