Learning Juntas in the Presence of Noise

  • Jan Arpe
  • Rüdiger Reischuk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3959)


The combination of two major challenges in algorithmic learning is investigated: dealing with huge amounts of irrelevant information and learning from noisy data. It is shown that large classes of Boolean concepts that only depend on a small fraction of their variables—so-called juntas—can be learned efficiently from uniformly distributed examples that are corrupted by random attribute and classification noise. We present solutions to cope with the manifold problems that inhibit a straightforward generalization of the noise-free case. Additionally, we extend our methods to non-uniformly distributed examples and derive new results for monotone juntas in this setting. We assume that the attribute noise is generated by a product distribution; otherwise, fault-tolerant learning is in general impossible, as we show by constructing a noise distribution P and a concept class \(\mathcal{C}\) such that \(\mathcal{C}\) cannot be learned under P-noise.
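The setting the abstract describes can be illustrated with a small sketch. The standard Fourier-based route to junta learning estimates the degree-1 correlations E[f(x)·x_i] over the {-1, +1} domain; random classification noise at rate η merely scales every such correlation by (1 − 2η), and independent attribute noise at rate p scales it by a further (1 − 2p), so the correlations of relevant variables survive attenuated but detectable. The following is only an illustration of that phenomenon under simplifying assumptions (a majority-of-3 junta, uniform examples, hypothetical noise rates), not the authors' algorithm:

```python
import random

random.seed(0)
n = 30                       # total number of attributes
relevant = [3, 7, 11]        # hidden 3-junta: majority of these bits
eta, p = 0.1, 0.1            # classification / attribute noise rates (assumed)

def majority(bits):          # target concept, in the {-1, +1} domain
    return 1 if sum(bits) > 0 else -1

def noisy_example():
    x = [random.choice((-1, 1)) for _ in range(n)]
    y = majority([x[i] for i in relevant])
    if random.random() < eta:                          # classification noise
        y = -y
    x = [-b if random.random() < p else b for b in x]  # attribute noise
    return x, y

# Estimate the degree-1 correlations E[f(x) * x_i] from noisy samples.
m = 20000
corr = [0.0] * n
for _ in range(m):
    x, y = noisy_example()
    for i in range(n):
        corr[i] += x[i] * y
corr = [c / m for c in corr]

# The largest |correlations| single out the relevant variables: each is
# about 0.5 * (1 - 2*eta) * (1 - 2*p) = 0.32 here, versus O(1/sqrt(m))
# sampling noise for the irrelevant ones.
found = sorted(sorted(range(n), key=lambda i: -abs(corr[i]))[:3])
print(found)                 # [3, 7, 11]
```

Note that this degree-1 heuristic already fails for concepts such as parity, whose low-degree correlations all vanish; handling such cases is precisely where the harder machinery for learning general juntas comes in.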


Keywords: Boolean Function · Relevant Variable · Concept Class · Truth Table · Irrelevant Information





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jan Arpe (1)
  • Rüdiger Reischuk (1)
  1. Institut für Theoretische Informatik, Universität zu Lübeck, Lübeck, Germany
