Three Learnable Models for the Description of Language

Clark, Alexander

doi:10.1007/978-3-642-13089-2_2

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6031)

Included in the following conference series:

International Conference on Language and Automata Theory and Applications

Abstract

Learnability is a vital property of formal grammars: representation classes should be defined in such a way that they are learnable. One way to build learnable representations is to make them objective or empiricist: the structure of the representation should be based on the structure of the language. Rather than defining a function from representation to language, we should start by defining a function from the language to the representation; following this strategy gives classes of representations that are easy to learn. We illustrate this approach with three classes, defined in analogy to the lowest three levels of the Chomsky hierarchy. First, we recall the canonical deterministic finite automaton, where the states of the automaton correspond to the right congruence classes of the language. Second, we define context-free grammars where the non-terminals of the grammar correspond to the syntactic congruence classes, and where the productions are defined by the syntactic monoid. Finally, we define a residuated lattice structure from the Galois connection between strings and contexts, which we call the syntactic concept lattice, and base a representation on this; this allows us to define a class of languages that includes some non-context-free languages, many context-free languages and all regular languages. All three classes are efficiently learnable under suitable learning paradigms.
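
The Galois connection between strings and contexts mentioned in the abstract can be illustrated on a small finite example. The sketch below is not taken from the paper: the toy language { a^n b^n : n >= 1 }, the finite string set K, the finite context set F, and all function names are assumptions chosen for illustration. It computes, for a set of strings, the contexts that accept every one of them, and then closes the set by collecting every string in K that those contexts accept.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# A context (left, right) wraps a string w as left + w + right.
Context = Tuple[str, str]


def in_language(w: str) -> bool:
    """Toy membership oracle (an assumption): L = { a^n b^n : n >= 1 }."""
    n = len(w) // 2
    return n >= 1 and w == "a" * n + "b" * n


def contexts_of(strings: Iterable[str],
                all_contexts: Iterable[Context],
                member: Callable[[str], bool] = in_language) -> FrozenSet[Context]:
    """Contexts from all_contexts that accept every string in `strings`."""
    strings = list(strings)
    return frozenset(
        (l, r) for (l, r) in all_contexts
        if all(member(l + w + r) for w in strings)
    )


def strings_of(contexts: Iterable[Context],
               all_strings: Iterable[str],
               member: Callable[[str], bool] = in_language) -> FrozenSet[str]:
    """Strings from all_strings that are accepted by every context in `contexts`."""
    contexts = list(contexts)
    return frozenset(
        w for w in all_strings
        if all(member(l + w + r) for (l, r) in contexts)
    )


def closure(strings: Iterable[str],
            all_strings: Iterable[str],
            all_contexts: Iterable[Context]) -> FrozenSet[str]:
    """Apply both maps in turn: strings -> shared contexts -> all strings fitting them."""
    return strings_of(contexts_of(strings, all_contexts), all_strings)


# Hypothetical finite observation sets, chosen only to keep the example small.
K = frozenset({"a", "b", "ab", "aab", "abb", "aabb"})
F = frozenset({("", ""), ("a", "b"), ("a", ""), ("", "b"), ("aa", "b"), ("a", "bb")})

if __name__ == "__main__":
    # Each closed string set, paired with its shared contexts, is one concept
    # of the lattice restricted to K and F.
    seen = set()
    for w in sorted(K):
        closed = closure({w}, K, F)
        if closed not in seen:
            seen.add(closed)
            print(sorted(closed), sorted(contexts_of(closed, F)))
```

With these particular sets, "ab" and "aabb" end up in the same closed set because they share the contexts ("", "") and ("a", "b"), and "a" groups with "aab" in the same way. Larger choices of K and F would give a finer approximation of the lattice; the choice here is only meant to keep the computation easy to check by hand.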

Author information

Authors and Affiliations

Department of Computer Science, Royal Holloway, University of London, Egham, TW20 0EX
Alexander Clark

Editor information

Editors and Affiliations

Research Group on Mathematical Linguistics, Universitat Rovira i Virgili, Avinguda Catalunya, 35, 43002, Tarragona, Spain
Adrian-Horia Dediu & Carlos Martín-Vide
Fachbereich IV - Informatik, Universität Trier, Campus II, Behringstraße, 54286, Trier, Germany
Henning Fernau

Cite this paper

Clark, A. (2010). Three Learnable Models for the Description of Language. In: Dediu, AH., Fernau, H., Martín-Vide, C. (eds) Language and Automata Theory and Applications. LATA 2010. Lecture Notes in Computer Science, vol 6031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13089-2_2

DOI: https://doi.org/10.1007/978-3-642-13089-2_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13088-5
Online ISBN: 978-3-642-13089-2
eBook Packages: Computer Science (R0)
