Three Learnable Models for the Description of Language

  • Alexander Clark
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6031)


Learnability is a vital property of formal grammars: representation classes should be defined in such a way that they are learnable. One way to build learnable representations is by making them objective or empiricist: the structure of the representation should be based on the structure of the language. Rather than defining a function from representation to language, we should start by defining a function from the language to the representation; following this strategy gives classes of representations that are easy to learn. We illustrate this approach with three classes, defined in analogy to the lowest three levels of the Chomsky hierarchy. First, we recall the canonical deterministic finite automaton, where the states of the automaton correspond to the right congruence classes of the language. Second, we define context-free grammars where the non-terminals of the grammar correspond to the syntactic congruence classes, and where the productions are defined by the syntactic monoid. Finally, we define a residuated lattice structure from the Galois connection between strings and contexts, which we call the syntactic concept lattice, and base a representation on it. This representation defines a class of languages that includes all regular languages, many context-free languages and some non-context-free languages. All three classes are efficiently learnable under suitable learning paradigms.
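The first and third constructions named in the abstract can be illustrated on a toy example. The following is a minimal sketch, not the learning algorithms of the paper: it assumes a small finite language given as a Python set (standing in for a membership oracle), single-character symbols, and hypothetical function names chosen for this illustration. `canonical_dfa` builds an automaton whose states are the distinct residuals u⁻¹L = {w : uw ∈ L}, which are in one-to-one correspondence with the right congruence classes. `contexts_of` and `strings_of` are the two maps of the Galois connection between sets of strings and sets of contexts (l, r), restricted to finite candidate sets; composing them gives the closure used to form concepts.

```python
def residual(language, prefix):
    """Return the residual {w : prefix + w in language} of a finite language."""
    return frozenset(w[len(prefix):] for w in language if w.startswith(prefix))


def canonical_dfa(language, alphabet):
    """Build the canonical DFA of a finite language; each state is a residual."""
    start = frozenset(language)
    states, transitions, frontier = {start}, {}, [start]
    while frontier:
        state = frontier.pop()
        for a in alphabet:
            # Reading symbol a from the state u^{-1}L leads to (ua)^{-1}L.
            target = residual(state, a)
            transitions[(state, a)] = target
            if target not in states:
                states.add(target)
                frontier.append(target)
    # A residual is accepting exactly when it contains the empty string.
    accepting = {s for s in states if "" in s}
    return start, states, transitions, accepting


def accepts(dfa, word):
    """Run the DFA on a word over the alphabet it was built with."""
    start, _, transitions, accepting = dfa
    state = start
    for a in word:
        state = transitions[(state, a)]
    return state in accepting


def contexts_of(strings, contexts, language):
    """Contexts (l, r) from the candidate set shared by every string in `strings`."""
    return frozenset(
        (l, r) for (l, r) in contexts
        if all(l + s + r in language for s in strings)
    )


def strings_of(ctxs, strings, language):
    """Strings from the candidate set that occur in every context in `ctxs`."""
    return frozenset(
        s for s in strings
        if all(l + s + r in language for (l, r) in ctxs)
    )


# Toy data (assumed for illustration only).
L = {"ab", "aab", "b"}
dfa = canonical_dfa(L, "ab")

K = {"", "a", "b", "aa", "ab"}            # candidate strings
F = {("", ""), ("a", ""), ("", "b")}      # candidate contexts
shared = contexts_of({"b"}, F, L)         # contexts in which "b" occurs
closed = strings_of(shared, K, L)         # closure of {"b"} within K
```

In this toy example the strings "b" and "ab" occur in the same candidate contexts, so the closure of {"b"} contains both. Restricting to finite sets of strings and contexts is a simplification made here so that the maps are computable by enumeration.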


Residuated Lattice; Regular Language; Primitive Element; Congruence Class; Context Free Language





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Alexander Clark (1)
  1. Department of Computer Science, Royal Holloway, University of London, Egham
