Learning Unions of k-Testable Languages

  • Conference paper
Language and Automata Theory and Applications (LATA 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11417)

Abstract

A classical problem in grammatical inference is to identify a language from a set of examples. In this paper, we address the problem of identifying a union of languages from examples that belong to several different unknown languages. Decomposing a language into smaller pieces that are easier to represent should make learning easier than targeting a single, overly general language. In particular, we consider k-testable languages in the strict sense (k-TSS). These are defined by sets of allowed prefixes, infixes (substrings) and suffixes that words in the language may contain. We establish a Galois connection between the lattice of all languages over an alphabet \(\Sigma\) and the lattice of k-TSS languages over \(\Sigma\). We also define a simple metric on k-TSS languages. Together, the Galois connection and the metric allow us to derive an efficient algorithm for learning unions of k-TSS languages. We evaluate our algorithm on an industrial dataset and thus demonstrate the relevance of our approach.
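To make the description of k-TSS languages concrete, here is a minimal Python sketch. It assumes a simplified form of the usual characterization: allowed prefixes and suffixes of length k-1, allowed infixes of length k, and an explicit set of allowed words shorter than k. The names `KTestVector`, `learn` and `accepts` are hypothetical and are not taken from the paper or its repository; the sketch covers only single-language inference and membership, not the union-learning algorithm, the Galois connection, or the metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KTestVector:
    """Hypothetical container for the sets that define a k-TSS language."""
    k: int
    prefixes: frozenset  # allowed prefixes of length k-1
    suffixes: frozenset  # allowed suffixes of length k-1
    infixes: frozenset   # allowed infixes (substrings) of length k
    short: frozenset     # allowed words shorter than k


def learn(words, k):
    """Collect the prefixes, suffixes, infixes and short words seen in a sample."""
    prefixes, suffixes, infixes, short = set(), set(), set(), set()
    for w in words:
        if len(w) < k:
            short.add(w)
        if len(w) >= k - 1:
            prefixes.add(w[:k - 1])
            suffixes.add(w[len(w) - (k - 1):])
        for i in range(len(w) - k + 1):
            infixes.add(w[i:i + k])
    return KTestVector(k, frozenset(prefixes), frozenset(suffixes),
                       frozenset(infixes), frozenset(short))


def accepts(z, w):
    """Membership test: short words must be listed explicitly; longer words
    need an allowed prefix, an allowed suffix, and only allowed infixes."""
    if len(w) < z.k:
        return w in z.short
    return (w[:z.k - 1] in z.prefixes
            and w[len(w) - (z.k - 1):] in z.suffixes
            and all(w[i:i + z.k] in z.infixes
                    for i in range(len(w) - z.k + 1)))


# Two tiny samples, assumed to come from two different unknown languages.
za = learn(["abc"], k=2)
zb = learn(["cba"], k=2)
# Learning a single language from the mixed sample generalizes further.
merged = learn(["abc", "cba"], k=2)
```

In this toy case, `accepts(merged, "aba")` holds although neither `za` nor `zb` accepts `"aba"`. This is the kind of over-generalization that learning a union of smaller languages is meant to avoid.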

This research is supported by the Dutch Technology Foundation (STW) under the Robust CPS program (project 12693).

Notes

  1. For missing proofs, see http://arxiv.org/abs/1812.08269.

  2. See https://gitlab.science.ru.nl/alinard/learning-union-ktss.

Author information

Correspondence to Alexis Linard.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Linard, A., de la Higuera, C., Vaandrager, F. (2019). Learning Unions of k-Testable Languages. In: Martín-Vide, C., Okhotin, A., Shapira, D. (eds) Language and Automata Theory and Applications. LATA 2019. Lecture Notes in Computer Science, vol 11417. Springer, Cham. https://doi.org/10.1007/978-3-030-13435-8_24

  • DOI: https://doi.org/10.1007/978-3-030-13435-8_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13434-1

  • Online ISBN: 978-3-030-13435-8

  • eBook Packages: Computer Science, Computer Science (R0)
