On the Learnability of Shuffle Ideals

  • Conference paper
Algorithmic Learning Theory (ALT 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7568)

Abstract

Although PAC learning unrestricted regular languages is long known to be a very difficult problem, one might suppose the existence (and even an abundance) of natural efficiently learnable sub-families. When our literature search for a natural efficiently learnable regular family came up empty, we proposed the shuffle ideals as a prime candidate. A shuffle ideal generated by a string u is simply the collection of all strings containing u as a (discontiguous) subsequence. This fundamental language family is of theoretical interest in its own right and also provides the building blocks for other important language families. Somewhat surprisingly, we discovered that even a class as simple as the shuffle ideals is not properly PAC learnable, unless RP=NP. In the positive direction, we give an efficient algorithm for properly learning shuffle ideals in the statistical query (and therefore also PAC) model under the uniform distribution.
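To make the definition concrete, here is a minimal sketch of a membership test for the shuffle ideal generated by a string u: it checks whether u occurs in x as a (not necessarily contiguous) subsequence. The function name `in_shuffle_ideal` is an illustrative assumption and does not come from the paper; this is only the membership check implied by the definition, not the learning algorithm the abstract refers to.

```python
def in_shuffle_ideal(u: str, x: str) -> bool:
    """Return True if x belongs to the shuffle ideal generated by u,
    i.e. if u is a (not necessarily contiguous) subsequence of x.

    Hypothetical helper for illustration; not taken from the paper.
    """
    matched = 0  # number of leading symbols of u matched so far
    for symbol in x:
        # Greedily match the next unmatched symbol of u, left to right.
        if matched < len(u) and symbol == u[matched]:
            matched += 1
    return matched == len(u)


if __name__ == "__main__":
    print(in_shuffle_ideal("ab", "xaxbx"))  # u = "ab" embeds in "xaxbx"
    print(in_shuffle_ideal("ab", "ba"))     # order matters
```

The greedy left-to-right scan suffices because matching each symbol of u at its earliest possible position in x never rules out a later match, so the test runs in time linear in the length of x.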

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Angluin, D., Aspnes, J., Kontorovich, A. (2012). On the Learnability of Shuffle Ideals. In: Bshouty, N.H., Stoltz, G., Vayatis, N., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2012. Lecture Notes in Computer Science, vol 7568. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34106-9_12

  • DOI: https://doi.org/10.1007/978-3-642-34106-9_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34105-2

  • Online ISBN: 978-3-642-34106-9

  • eBook Packages: Computer Science, Computer Science (R0)
