Learning Tree Languages from Text

  • Henning Fernau
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2375)


We study the problem of learning regular tree languages from text. We show that the framework of function distinguishability, introduced in our ALT 2000 paper, generalizes from string languages to tree languages, thereby providing a large source of identifiable classes of regular tree languages. Each of these classes can be characterized in various ways. We also present a generic inference algorithm with polynomial update time and prove its correctness. In this way, we generalize previous work of Angluin, Sakakibara, and ourselves. Finally, we show that all regular tree languages can be identified approximately in this way.


Keywords: Regular Language, Inference Algorithm, Derivation Tree, Tree Automaton, Tree Language
These keywords were added by machine and not by the authors.




References

  1. D. Angluin. Inductive inference of formal languages from positive data. Information and Control, 45:117–135, 1980.
  2. D. Angluin. Inference of reversible languages. Journal of the Association for Computing Machinery, 29(3):741–765, 1982.
  3. M. Bernard and C. de la Higuera. GIFT: Grammatical Inference For Terms. International Conference on Inductive Logic Programming ILP. Late Breaking Paper, 1999. French journal version: Apprentissage de programmes logiques par inférence grammaticale. Revue d’Intelligence Artificielle, 14:375–396, 2001.
  4. J. Besombes and J.-Y. Marion. Identification of reversible dependency tree languages. In L. Popelínský and M. Nepil, editors, Proceedings of 3rd Workshop on Learning Languages in Logic LLL’01, pages 11–22, 2001.
  5. S. Crespi-Reghizzi, M. A. Melkanoff, and L. Lichten. The use of grammatical inference for designing programming languages. Communications of the ACM, 16:83–90, 1972.
  6. L. F. Fass. Learning context-free languages from their structured sentences. SIGACT News, 15(3):24–35, 1983.
  7. H. Fernau. Identification of function distinguishable languages. In H. Arimura, S. Jain, and A. Sharma, editors, Proceedings of the 11th International Conference Algorithmic Learning Theory ALT, volume 1968 of LNCS/LNAI, pages 116–130. Springer, 2000.
  8. H. Fernau. Learning of terminal distinguishable languages. In Proceedings AMAI 2000, see
  9. H. Fernau. Approximative learning of regular languages. In L. Pacholski and P. Ružička, editors, SOFSEM; Theory and Practice of Informatics, volume 2234 of LNCS, pages 223–232. Springer, 2001.
  10. H. Fernau. Learning tree languages from text. Technical Report WSI-2001-19, Universität Tübingen (Germany), Wilhelm-Schickard-Institut für Informatik, 2001.
  11. H. Fernau and A. Radl. Algorithms for learning function distinguishable regular languages. In Statistical and Syntactical Methods of Pattern Recognition SPR+SSPR, to appear in the LNCS series. Springer, 2002.
  12. C. C. Florêncio. Consistent identification in the limit of any of the classes k-valued is NP-hard. In Proceedings of the Conference on Logical Aspects of Computational Linguistics LACL, volume 2099 of LNCS/LNAI, pages 125–138. Springer, 2001.
  13. E. M. Gold. Language identification in the limit. Information and Control, 10:447–474, 1967.
  14. R. C. Gonzalez and M. G. Thomason. Syntactic Pattern Recognition; An Introduction. Addison-Wesley, 1978.
  15. C. de la Higuera. Current trends in grammatical inference. In F. J. Ferri et al., editors, Advances in Pattern Recognition, Joint IAPR International Workshops SSPR+SPR, volume 1876 of LNCS, pages 28–31. Springer, 2000.
  16. M. Kanazawa. Learnable Classes of Categorial Grammars. CSLI, 1998.
  17. T. Knuutila. How to invent characterizable methods for regular languages. In K. P. Jantke et al., editors, 4th Workshop on Algorithmic Learning Theory ALT, volume 744 of LNCS/LNAI, pages 209–222. Springer, 1993.
  18. T. Knuutila and M. Steinby. The inference of tree languages from finite samples: an algebraic approach. Theoretical Computer Science, 129:337–367, 1994.
  19. S. Kobayashi and T. Yokomori. Learning approximately regular languages with reversible languages. Theoretical Computer Science, 174(1–2):251–257, 1997.
  20. D. López and S. España. Error correcting tree language inference. Pattern Recognition Letters, 23:1–12, 2002.
  21. D. López and I. Piñaga. Syntactic pattern recognition by error correcting analysis on tree automata. In F. J. Ferri et al., editors, Advances in Pattern Recognition, Joint IAPR International Workshops SSPR+SPR, volume 1876 of LNCS, pages 133–142. Springer, 2000.
  22. V. Radhakrishnan and G. Nagaraja. Inference of regular grammars via skeletons. IEEE Transactions on Systems, Man and Cybernetics, 17(6):982–992, 1987.
  23. J. R. Rico-Juan, J. Calera-Rubio, and R. C. Carrasco. Probabilistic k-testable tree languages. In A. L. Oliveira, editor, Grammatical Inference: Algorithms and Applications, 5th International Colloquium, ICGI, volume 1891 of LNCS/LNAI, pages 221–228. Springer, 2000.
  24. G. Rozenberg and A. Salomaa, editors. Handbook of Formal Languages, Volume III. Berlin: Springer, 1997.
  25. Y. Sakakibara. Learning context-free grammars from structural data in polynomial time. Theoretical Computer Science, 76:223–242, 1990.
  26. Y. Sakakibara. Efficient learning of context-free grammars from positive structural examples. Information and Computation, 97(1):23–60, March 1992.
  27. Y. Sakakibara and H. Muramatsu. Learning context-free grammars from partially structured examples. In A. L. Oliveira, editor, Grammatical Inference: Algorithms and Applications, 5th International Colloquium, ICGI, volume 1891 of LNCS/LNAI, pages 229–240. Springer, 2000.
  28. Y. Takada and T. Y. Nishida. A note on grammatical inference of slender context-free languages. In L. Miclet and C. de la Higuera, editors, Proceedings of the Third International Colloquium on Grammatical Inference ICGI: Learning Syntax from Sentences, volume 1147 of LNCS/LNAI, pages 117–125. Springer, 1996.
  29. H. Volger. Grammars with generalized contextfree rules and their tree automata. In Proceedings of Computational Linguists in the Netherlands Meeting CLIN; Selected Papers, pages 223–233. see, 1999.
  30. T. Yokomori. Inductive inference of context-free languages based on context-free expressions. International Journal of Computer Mathematics, 24:115–140, 1988.
  31. T. Yokomori. Polynomial-time learning of very simple grammars from positive data. In Proceedings 4th Workshop on Computational Learning Theory COLT, pages 213–227, San Mateo, CA, 1991. Morgan Kaufmann.
  32. T. Yokomori. On learning systolic languages. In K. P. Jantke, S. Doshita, K. Furukawa, and T. Nishida, editors, Proceedings of the 3rd Workshop on Algorithmic Learning Theory ALT, volume 743 of LNCS/LNAI, pages 41–52. Springer, 1992.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Henning Fernau (1)
  1. Department of Computer Science and Software Engineering, University of Newcastle, Callaghan, Australia
