Learning Unions of k-Testable Languages

Linard, Alexis; de la Higuera, Colin; Vaandrager, Frits

doi:10.1007/978-3-030-13435-8_24

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11417)

Included in the following conference series:

International Conference on Language and Automata Theory and Applications

Abstract

A classical problem in grammatical inference is to identify a language from a set of examples. In this paper, we address the problem of identifying a union of languages from examples that belong to several different unknown languages. Decomposing a language into smaller pieces that are easier to represent should make learning easier than aiming for a single, overly general language. In particular, we consider k-testable languages in the strict sense (k-TSS). These are defined by sets of allowed prefixes, infixes (substrings) and suffixes that words in the language may contain. We establish a Galois connection between the lattice of all languages over an alphabet \(\Sigma\) and the lattice of k-TSS languages over \(\Sigma\). We also define a simple metric on k-TSS languages. Together, the Galois connection and the metric allow us to derive an efficient algorithm for learning unions of k-TSS languages. We evaluate our algorithm on an industrial dataset and thus demonstrate the relevance of our approach.
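To make the notion of a k-TSS language concrete, the following is a minimal Python sketch of the textbook construction (as in García and Vidal's work and de la Higuera's book), not the algorithm of this paper. The names `KTestVector`, `learn_ktss`, `accepts` and `accepts_union` are hypothetical, and the handling of words shorter than k (accepted only if seen verbatim in the sample) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KTestVector:
    """Hypothetical container for the sets that define a k-TSS language."""
    k: int
    prefixes: frozenset     # allowed prefixes of length k-1
    suffixes: frozenset     # allowed suffixes of length k-1
    infixes: frozenset      # allowed substrings of length k
    short_words: frozenset  # sample words shorter than k (assumption)


def learn_ktss(sample, k):
    """Collect the prefixes, suffixes and infixes seen in a positive sample."""
    prefixes, suffixes, infixes, short = set(), set(), set(), set()
    for w in sample:
        if len(w) < k:
            short.add(w)
        if len(w) >= k - 1:
            prefixes.add(w[:k - 1])
            suffixes.add(w[len(w) - (k - 1):])
        for i in range(len(w) - k + 1):
            infixes.add(w[i:i + k])
    return KTestVector(k, frozenset(prefixes), frozenset(suffixes),
                       frozenset(infixes), frozenset(short))


def accepts(z, w):
    """Check whether word w belongs to the language described by z."""
    k = z.k
    if len(w) < k:
        return w in z.short_words
    return (w[:k - 1] in z.prefixes
            and w[len(w) - (k - 1):] in z.suffixes
            and all(w[i:i + k] in z.infixes
                    for i in range(len(w) - k + 1)))


def accepts_union(vectors, w):
    """A word is in a union of k-TSS languages if any member accepts it."""
    return any(accepts(z, w) for z in vectors)


z_ab = learn_ktss(["abab", "ab"], k=2)
z_c = learn_ktss(["cc", "ccc"], k=2)
print(accepts(z_ab, "ababab"))               # generalises beyond the sample
print(accepts(z_ab, "abba"))                 # "bb" was never observed
print(accepts_union([z_ab, z_c], "cccc"))
```

In this sketch the union is represented simply as a list of k-test vectors; how the sample is partitioned among the member languages is the question the paper addresses and is not modelled here.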

This research is supported by the Dutch Technology Foundation (STW) under the Robust CPS program (project 12693).

Notes

1. For missing proofs, see http://arxiv.org/abs/1812.08269.
2. See https://gitlab.science.ru.nl/alinard/learning-union-ktss.

Author information

Authors and Affiliations

Institute for Computing and Information Science, Radboud University, Nijmegen, The Netherlands
Alexis Linard & Frits Vaandrager
Laboratoire des Sciences du Numérique de Nantes, Université de Nantes, Nantes, France
Colin de la Higuera

Corresponding author

Correspondence to Alexis Linard.

Editor information

Editors and Affiliations

Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
Saint Petersburg State University, St. Petersburg, Russia
Alexander Okhotin
Ariel University, Ariel, Israel
Dana Shapira

About this paper

Cite this paper

Linard, A., de la Higuera, C., Vaandrager, F. (2019). Learning Unions of k-Testable Languages. In: Martín-Vide, C., Okhotin, A., Shapira, D. (eds) Language and Automata Theory and Applications. LATA 2019. Lecture Notes in Computer Science, vol 11417. Springer, Cham. https://doi.org/10.1007/978-3-030-13435-8_24

DOI: https://doi.org/10.1007/978-3-030-13435-8_24
Published: 14 February 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13434-1
Online ISBN: 978-3-030-13435-8
eBook Packages: Computer Science, Computer Science (R0)