Skip to main content

Distributional Learning of Context-Free and Multiple Context-Free Grammars

  • Chapter
  • First Online:
Topics in Grammatical Inference

Abstract

This chapter reviews recent progress in distributional learning in grammatical inference as applied to learning context-free and multiple context-free grammars. We discuss the basic principles of distributional learning, and present two classes of representations, primal and dual, where primal approaches use nonterminals based on strings or sets of strings and dual approaches use nonterminals based on contexts or sets of contexts. We then present learning algorithms based on these two models using a variety of learning paradigms, and then discuss the natural extension to mildly context-sensitive formalisms, using multiple context-free grammars as a representative formalism.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 109.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    See Learning Grammars and Automata with Queries, de la Higuera (Chap. 3).

  2. 2.

    The original paper defining this [12] unfortunately contains some errors.

  3. 3.

    An infinite language L is said to have the constant-growth property if \(\exists k \in \mathbb {N}. \forall u \in L. \exists v \in L.\ 0< |v|-|u| \le k\).

  4. 4.

    It is worth noting that MCFGs may be much larger than the smallest equivalent Minimalist Grammar.

  5. 5.

    The notation adopted in this chapter follows Smullyan’s elementary formal systems [55] rather than [48].

  6. 6.

    We only consider non-deleting productions in this chapter.

  7. 7.

    Technically speaking, the OT for the highest dimension subsumes the other ones for lower dimensions, since m-contexts and m-words can be seen as special cases of n-contexts and n-words, respectively, for \(m < n\): e.g., \(l \Box r \Box \odot _2 \langle u,\lambda \rangle = l \Box r \odot u\). However, there are cases where it is more reasonable to exclude the empty string from consideration.

  8. 8.

    The original definition [66] has a slightly weaker form, where strings \(m,u_1,u_2,v_1,v_2\) are restricted to be non-empty strings. \(ab^*cd^*e\) is 2d-substitutable according to the original definition, but it is not the case in our simplified definition.

References

  1. Adriaans, P.: Learning shallow context-free languages under simple distributions. Tech. Rep. ILLC Report PP-1999-13, Institute for Logic, Language and Computation, Amsterdam (1999)

    Google Scholar 

  2. Angluin, D.: Learning regular sets from queries and counterexamples. Information and Computation 75(2), 87–106 (1987)

    Article  MathSciNet  MATH  Google Scholar 

  3. Angluin, D., Kharitonov, M.: When won’t membership queries help? J. Comput. Syst. Sci. 50, 336–355 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  4. Boasson, L., Sénizergues, S.: NTS languages are deterministic and congruential. J. Comput. Syst. Sci. 31(3), 332–342 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  5. Brill, E., Magermann, D., Marcus, M., Santorini, B.: Deducing linguistic structure from the statistics of large corpora. In: Proceedings of the Third DARPA Workshop on Speech and Natural Language, pp. 275–282 (1990)

    Google Scholar 

  6. Chomsky, N.: The logical structure of linguistic theory. Ph.D. thesis, MIT (1955)

    Google Scholar 

  7. Chomsky, N.: Language and mind, 3rd edn. Cambridge University Press (2006)

    Google Scholar 

  8. Clark, A.: PAC-learning unambiguous NTS languages. In: Y. Sakakibara, S. Kobayashi, K. Sato, T. Nishino, E. Tomita (eds.) Grammatical Inference: Algorithms and Applications, Lecture Notes in Computer Science, vol. 4201, pp. 59–71. Springer Berlin Heidelberg (2006)

    Chapter  Google Scholar 

  9. Clark, A.: A learnable representation for syntax using residuated lattices. In: Proceedings of the 14th Conference on Formal Grammar. Bordeaux, France (2009). http://www.papers/alexcFG2009.pdf

  10. Clark, A.: Distributional learning of some context-free languages with a minimally adequate teacher. In: J. Sempere, P. García (eds.) Proceedings of ICGI, no. 6339 in LNCS, pp. 24–37. Springer (2010)

    Google Scholar 

  11. Clark, A.: Efficient, correct, unsupervised learning of context-sensitive languages. In: Proceedings of the Fourteenth Conference on Computational Natural Language Learning, pp. 28–37. Association for Computational Linguistics, Uppsala, Sweden (2010)

    Google Scholar 

  12. Clark, A.: Learning context free grammars with the syntactic concept lattice. In: J. Sempere, P. García (eds.) Grammatical Inference: Theoretical Results and Applications. Proceedings of the International Colloquium on Grammatical Inference, pp. 38–51. Springer (2010)

    Google Scholar 

  13. Clark, A.: Inference of inversion transduction grammars. In: Proceedings of ICML. Bellevue, Washington (2011)

    Google Scholar 

  14. Clark, A.: The syntactic concept lattice: Another algebraic theory of the context-free languages? Journal of Logic and Computation (2013). doi:10.1093/logcom/ext037

    Google Scholar 

  15. Clark, A.: Learning trees from strings: A strong learning algorithm for some context-free grammars. Journal of Machine Learning Research 14, 3537–3559 (2014)

    MathSciNet  MATH  Google Scholar 

  16. Clark, A., Eyraud, R.: Polynomial identification in the limit of substitutable context-free languages. Journal of Machine Learning Research 8, 1725–1745 (2007)

    MathSciNet  MATH  Google Scholar 

  17. Clark, A., Eyraud, R., Habrard, A.: Using contextual representations to efficiently learn context-free languages. Journal of Machine Learning Research 11, 2707–2744 (2010)

    MathSciNet  MATH  Google Scholar 

  18. Clark, A., Yoshinaka, R.: Beyond semilinearity: Distributional learning of parallel multiple context-free grammars. In: J. Heinz, C. de la Higuera, T. Oates (eds.) Proceedings of the Eleventh International Conference on Grammatical Inference, JMLR Workshop and Conference Proceedings, vol. 21, pp. 84–96 (2012)

    Google Scholar 

  19. Clark, A., Yoshinaka, R.: Distributional learning of parallel multiple context-free grammars. Machine Learning pp. 1–27 (2013). doi:10.1007/s10994-013-5403-2.

    Google Scholar 

  20. Dediu, A.H., Martín-Vide, C. (eds.): Language and Automata Theory and Applications - 6th International Conference, LATA 2012, A Coruña, Spain, March 5-9, 2012. Proceedings, Lecture Notes in Computer Science, vol. 7183. Springer (2012)

    Google Scholar 

  21. Eyraud, R., Janodet, J., Oates, T.: Learning substitutable binary plane graph grammars. In: Proceedings of ICGI, vol. 21, pp. 114–128 (2012)

    Google Scholar 

  22. Fisher, M.J.: Grammars with macro-like productions. Ph.D. thesis, Harvard University (1968)

    Google Scholar 

  23. Gold, E.M.: Language identification in the limit. Information and Computation 10(5), 447–474 (1967)

    MATH  Google Scholar 

  24. Harris, Z.: Distributional structure. Word 10(2-3), 146–62 (1954)

    Google Scholar 

  25. Hotz, G., Pitsch, G.: On parsing coupled-context-free languages. Theoretical Computer Science 161(1&2), 205–233 (1996)

    Article  MathSciNet  MATH  Google Scholar 

  26. Huybrechts, R.A.C.: The weak inadequacy of context-free phrase structure grammars. In: G. de Haan, M. Trommelen, W. Zonneveld (eds.) Van Periferie naar Kern. Foris, Dordrecht, Holland (1984)

    Google Scholar 

  27. Joshi, A.K.: Tree adjoining grammars: how much context-sensitivity is required to provide reasonable structural descriptions? In: D.R. Dowty, L. Karttunen, A. Zwicky (eds.) Natural Language Parsing, pp. 206–250. Cambridge University Press, Cambridge, MA (1985)

    Chapter  Google Scholar 

  28. Joshi, A.K., Vijay-Shanker, K., Weir, D.J.: The convergence of mildly context-sensitive grammar formalisms. In: P. Sells, S.M. Shieber, T. Wasow (eds.) Foundational Issues in Natural Language Processing, pp. 31–81. MIT Press, Cambridge, MA (1991)

    Google Scholar 

  29. Kaji, Y., Nakanishi, R., Seki, H., Kasami, T.: The universal recognition problems for parallel multiple context-free grammars and for their subclasses. IEICE Transaction on Information and Systems E75-D(7), 499–508 (1992)

    Google Scholar 

  30. Kanazawa, M., Salvati, S.: The copying power of well-nested multiple context-free grammars. In: Language and Automata Theory and Applications, pp. 344–355. Springer (2010)

    Google Scholar 

  31. Kanazawa, M., Salvati, S.: Mix is not a tree-adjoining language. In: ACL (1), pp. 666–674. The Association for Computer Linguistics (2012)

    Google Scholar 

  32. Kasprzik, A., Yoshinaka, R.: Distributional learning of simple context-free tree grammars. In: J. Kivinen, C. Szepesvári, E. Ukkonen, T. Zeugmann (eds.) Algorithmic Learning Theory, Lecture Notes in Computer Science, vol. 6925, pp. 398–412. Springer (2011)

    Google Scholar 

  33. Keller, B., Lutz, R.: Evolutionary induction of stochastic context free grammars. Pattern Recognition 38(9), 1393–1406 (2005)

    Article  MATH  Google Scholar 

  34. Klein, D., Manning, C.D.: A generative constituent-context model for improved grammar induction. In: Proceedings of the 40th Annual Meeting of the ACL (2002)

    Google Scholar 

  35. Kracht, M.: The Mathematics of Language, Studies in Generative Grammar, vol. 63, pp. 408–409. Mouton de Gruyter (2003)

    Google Scholar 

  36. Kulagina, O.S.: One method of defining grammatical concepts on the basis of set theory. Problemy Kiberneticy 1, 203–214 (1958). (in Russian)

    Google Scholar 

  37. Kunze, J.: Versuch eines objektivierten Grammatikmodells I, II. Z. Zeitschriff Phonetik Sprachwiss. Kommunikat 20-21 (1967–1968)

    Google Scholar 

  38. Langley, P., Stromsten, S.: Learning context-free grammars with a simplicity bias. In: R. López de Mántaras, E. Plaza (eds.) Machine Learning: ECML 2000, Lecture Notes in Computer Science, vol. 1810, pp. 220–228. Springer Berlin Heidelberg (2000)

    Chapter  Google Scholar 

  39. Leiss, H.: Learning CFGs with the finite context property: A note on A. Clark’s algorithm (2012). Manuscript

    Google Scholar 

  40. Luque, F.M., Infante-Lopez, G.: PAC-learning unambiguous \(k,l\)-NTS\(^{\le }\) languages. In: J.M. Sempere, P. García (eds.) Grammatical Inference: Theoretical Results and Applications, Lecture Notes in Computer Science, vol. 6339, pp. 122–134. Springer Berlin Heidelberg (2010)

    Chapter  Google Scholar 

  41. Marcus, S.: Algebraic Linguistics; Analytical Models. Academic Press, New York (1967)

    Google Scholar 

  42. Myhill, J.: Review of On Syntactical Categories by Yehoshua Bar-Hillel. The Journal of Symbolic Logic 15(3), 220 (1950)

    Article  Google Scholar 

  43. Oncina, J., García, P., Vidal, E.: Learning subsequential transducers for pattern recognition interpretation tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 448–458 (1993)

    Article  Google Scholar 

  44. Pitt, L.: Inductive inference, DFAs, and computational complexity. In: Proceedings of 2nd Workshop on Analogical and Inductive Inference, Lecture Notes in Computer Science, vol. 397, pp. 18–44 (1989)

    Google Scholar 

  45. Rambow, O., Satta, G.: Independent parallelism in finite copying parallel rewriting systems. Theor. Comput. Sci. 223(1-2), 87–120 (1999)

    Article  MathSciNet  MATH  Google Scholar 

  46. Sakakibara, Y.: Learning context-free grammars from structural data in polynomial time. Theoretical Computer Science 76(2-3), 223–242 (1990)

    Article  MathSciNet  MATH  Google Scholar 

  47. Salvati, S.: MIX is a 2-MCFL and the word problem in \({\mathbb{Z}}^2\) is solved by a third-order collapsible pushdown automaton. Tech. Rep. Inria-00564552, version 1, INRIA (2011). URL http://hal.inria.fr/inria-00564552

  48. Seki, H., Matsumura, T., Fujii, M., Kasami, T.: On multiple context-free grammars. Theoretical Computer Science 88(2), 191–229 (1991)

    Article  MathSciNet  MATH  Google Scholar 

  49. Sénizergues, G.: The equivalence and inclusion problems for NTS languages. Journal of Computer and System Sciences 31(3), 303–331 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  50. Sestier, A.: Contribution à une théorie ensembliste des classifications linguistiques. In: Premier Congrès de l’Association Française de Calcul, pp. 293–305. Grenoble (1960)

    Google Scholar 

  51. Shibata, C., Yoshinaka, R.: PAC learning of some subclasses of context-free grammars with basic distributional properties from positive data. In: S. Jain, R. Munos, F. Stephan, T. Zeugmann (eds.) ALT, Lecture Notes in Computer Science, vol. 8139, pp. 143–157. Springer (2013)

    Google Scholar 

  52. Shieber, S.M.: Evidence against the context-freeness of natural language. Linguistics and Philosophy 8, 333–343 (1985)

    Article  Google Scholar 

  53. Shinohara, T.: Rich classes inferrable from positive data – length-bounded elementary formal systems. Information and computation 108(2), 175–186 (1994)

    Article  MathSciNet  MATH  Google Scholar 

  54. Shirakawa, H., Yokomori, T.: Polynomial-time MAT learning of c-deterministic context-free grammars. Transactions of the Information Processing Society of Japan 34, 380–390 (1993)

    Google Scholar 

  55. Smullyan, R.: Theory of Formal Systems. Princeton University Press (1961)

    Google Scholar 

  56. Stabler, E.: Derivational minimalism. In: C. Retoré (ed.) Logical aspects of computational linguistics (LACL 1996), pp. 68–95. Springer (1997)

    Google Scholar 

  57. van Helden, W.: Case and gender: Concept formation between morphology and syntax (II volumes). Studies in Slavic and General Linguistics. Rodopi, Amsterdam-Atlanta (1993)

    Google Scholar 

  58. van Zaanen, M.: ABL: Alignment-based learning. In: COLING 2000 - Proceedings of the 18th International Conference on Computational Linguistics, pp. 961–967 (2000)

    Google Scholar 

  59. Vijay-Shanker, K., Weir, D.J.: The equivalence of four extensions of context-free grammars. Mathematical Systems Theory 27(6), 511–546 (1994)

    Article  MathSciNet  MATH  Google Scholar 

  60. Vijay-Shanker, K., Weir, D.J., Joshi, A.K.: Characterizing structural descriptions produced by various grammatical formalisms. In: Proceedings of the 25th annual meeting of Association for Computational Linguistics, pp. 104–111. Stanford (1987)

    Google Scholar 

  61. Wells, R.S.: Immediate constituents. Language 23(2), 81–117 (1947)

    Article  Google Scholar 

  62. Wurm, C.: Completeness of full Lambek calculus for syntactic concept lattices. In: Proceedings of the 17th conference on Formal Grammar 2012 (FG) (2012)

    Google Scholar 

  63. Yoshinaka, R.: Identification in the limit of \(k,l\)-substitutable context-free languages. In: A. Clark, F. Coste, L. Miclet (eds.) ICGI, Lecture Notes in Computer Science, vol. 5278, pp. 266–279. Springer (2008)

    Google Scholar 

  64. Yoshinaka, R.: Learning mildly context-sensitive languages with multidimensional substitutability from positive data. In: R. Gavaldà, G. Lugosi, T. Zeugmann, S. Zilles (eds.) ALT, Lecture Notes in Computer Science, vol. 5809, pp. 278–292. Springer (2009)

    Google Scholar 

  65. Yoshinaka, R.: Polynomial-time identification of multiple context-free languages from positive data and membership queries. In: J.M. Sempere, P. García (eds.) ICGI, pp. 230–244. Springer (2010)

    Google Scholar 

  66. Yoshinaka, R.: Efficient learning of multiple context-free languages with multidimensional substitutability from positive data. Theoretical Computer Science 412(19), 1821–1831 (2011)

    Article  MathSciNet  MATH  Google Scholar 

  67. Yoshinaka, R.: Towards dual approaches for learning context-free grammars based on syntactic concept lattices. In: G. Mauri, A. Leporati (eds.) Developments in Language Theory, Lecture Notes in Computer Science, vol. 6795, pp. 429–440. Springer (2011)

    Google Scholar 

  68. Yoshinaka, R.: Integration of the dual approaches in the distributional learning of context-free grammars. In: Dediu and Martín-Vide [20], pp. 538–550

    Google Scholar 

  69. Yoshinaka, R., Clark, A.: Polynomial time learning of some multiple context-free languages with a minimally adequate teacher. In: P. de Groote, M.J. Nederhof (eds.) Formal Grammar: 15th and 16th International Conference on Formal Grammar, pp. 192–206. Springer (2012)

    Google Scholar 

  70. Yoshinaka, R., Kanazawa, M.: Distributional learning of abstract categorial grammars. In: S. Pogodalla, J.P. Prost (eds.) LACL, Lecture Notes in Computer Science, vol. 6736, pp. 251–266. Springer (2011)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Alexander Clark .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Clark, A., Yoshinaka, R. (2016). Distributional Learning of Context-Free and Multiple Context-Free Grammars. In: Heinz, J., Sempere, J. (eds) Topics in Grammatical Inference. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48395-4_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-662-48395-4_6

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-48393-0

  • Online ISBN: 978-3-662-48395-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics