Abstract
We study the evolution of monotone conjunctions using local search; the fitness function that guides the search is correlation under Boolean loss. Building on the work of Diochnos and Turán [6], we generalize Valiant's algorithm [19] for the evolvability of monotone conjunctions from the uniform distribution \({\mathcal U}_n\) to binomial distributions \({\mathcal B}_n\).
Using a drilling technique for a frontier \(q\), we exploit a structure theorem for best \(q\)-approximations. We study the algorithm when hypotheses come from their natural representation class (\(\mathcal H = \mathcal C\)), as well as when hypotheses contain at most \(q\) variables (\(\mathcal H = \mathcal C_{\le q}\)). Our analysis reveals that \({\mathcal U}_n\) is a very special case in the analysis of binomial distributions with parameter \(p\), where \(p \in \mathcal{F} = \{2^{-1/k} \mid k \in \mathbb{N}^*\}\). On instances of dimension \(n\), we study approximate learning for \(0 < p < 2^{-\frac{1}{n-1}}\) when \(\mathcal H = \mathcal C\) and for \(0 < p < \sqrt[n-1]{2/3}\) when \(\mathcal H = \mathcal C_{\le q}\). Thus, in either case, approximate learning can be achieved for any \(0 < p < 1\) for sufficiently large \(n\).
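As a concrete illustration of the setup, the following is a minimal sketch in Python of local search over monotone conjunctions guided by correlation, under a product ("binomial") distribution in which each variable is satisfied independently with probability \(p\). The closed-form performance follows from independence; the neighborhood and the plain greedy ascent are simplifications, since the paper's actual algorithm (not reproduced in this excerpt) works with tolerances and sample-based performance estimates, and all names below are illustrative.

```python
def perf(h, c, p):
    """Exact correlation E[h(x) * c(x)] under the product distribution
    where each variable is 1 independently with probability p.
    Conjunctions are sets of variable indices with outputs in {-1, +1};
    by independence, only the sizes of h & c, h - c, c - h matter."""
    m = len(h & c)                       # mutual variables
    w = len(h - c)                       # variables only in the hypothesis
    r = len(c - h)                       # variables only in the target
    disagree = p ** (m + w) * (1 - p ** r) + p ** (m + r) * (1 - p ** w)
    return 1.0 - 2.0 * disagree

def neighborhood(h, n, q):
    """Hypotheses reachable by removing, adding, or swapping one variable,
    never exceeding q variables (the frontier)."""
    others = set(range(n)) - h
    nbrs = [h - {u} for u in h]                           # removals
    if len(h) < q:
        nbrs += [h | {v} for v in others]                 # additions
    nbrs += [(h - {u}) | {v} for u in h for v in others]  # swaps
    return nbrs

def evolve(target, n, p, q):
    """Greedy hill-climbing on exact correlation; halts at a local optimum."""
    h = set()
    while True:
        best = max(neighborhood(h, n, q), key=lambda g: perf(g, target, p))
        if perf(best, target, p) <= perf(h, target, p):
            return h
        h = best

# Example: under p = 2**(-1/2) (a member of F), a short target is recovered.
print(evolve({0, 1, 2}, n=8, p=2 ** -0.5, q=4))
```

For a target longer than the frontier \(q\), the loop halts at a hypothesis with at most \(q\) variables, which is the role the best \(q\)-approximations play in the analysis.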
Notes
1. As \(\text{h}\) will be clear from the context, we write \(N\) instead of \(N(\text{h})\).
2. As \(p\) ranges in \((0, 1)\), a natural question in (12) is whether \(\zeta \in \mathbb{Z}\); if so, \(q_m = \zeta + 1\), and otherwise \(q_m = \lceil \zeta \rceil < \zeta + 1\). Equivalently, does \(2p^{\zeta + \vartheta} - A = 0\) hold for some \(\zeta \in \mathbb{Z}\)? By Table 1 and the definition of \(\vartheta\), for \({\mathcal U}_n\) we have \(\zeta = 1\). Hence, in \({\mathcal U}_n\), when \(\frac{3}{2} \le \varepsilon < 3\), the two quantities for \(\mu\) in (8) take the same value over a range of \(\varepsilon\) values. Regardless of whether there are additional integer solutions, \(q_m\) can be computed efficiently (a small helper implementing this case split is sketched after these notes).
3. A clarifying comment is in order here. When \(\mathcal H = \mathcal C\), by Table 2, \(q \ge \lceil \log_{1/p}(2)\rceil \ge \log_{1/p}(2) = k\) always; thus, on an instance of dimension \(n\), as \(p\) increases in \(\mathcal{F}\) through successive values of \(k\), \(q\) increases at least as fast. This is not necessarily true when \(\mathcal H = \mathcal C_{\le q}\). By (11), when \(p \in \mathcal{F}\) with \(k \ge 3\) (i.e., \(p \in \mathcal{F}\) and \(p \ge 2^{-1/3}\)), for input \(\varepsilon\) such that \(2 > \varepsilon \ge 3p^{k-1} = \frac{3}{2p}\), we get \(q = \lceil \log_{1/p}(\frac{3}{\varepsilon})\rceil < \lfloor \log_{1/p}(2)\rfloor = \log_{1/p}(2) = k\). However, these distributions and input errors are irrelevant to our discussion, since for \(\left|\text{c}\right| \le q\) we have \(U = p^u \ge p^q \ge p^{k-1} = \frac{1}{2p} > \frac{1}{2}\) (these inequalities are verified numerically in a sketch after these notes).
4. Diochnos and Turán [6] gave a bound of \(2q\) for \({\mathcal U}_n\). \({\mathcal U}_n\) is once again special, because \(p = \frac{1}{2}\) is the unique member of \(\mathcal{F}\) for which, in the shrinking phase (Fig. 1(c)), \(U > \frac{1}{2} \Rightarrow U = 1 \Rightarrow u = 0\); that is, one needs to argue only about specializations of the target. For \(p < \frac{1}{2}\), Fig. 1(b) never applies, Fig. 1(c) again concerns only specializations of the target, and we can then match their \(2q\) bound. However, we use \(3q\) throughout for uniformity in the analysis.
5. This example reveals another aspect of our approach. There are cases where \(q + \vartheta \ge n\), even when \(\mathcal H = \mathcal C\). Then, our method is powerful enough to perform exact learning (there are no long targets); however, only an approximation of the target will be returned, satisfying \(\text{Perf}_{{\mathcal B}_n}\left(\text{h}, \text{c}\right) > 1 - \varepsilon\). On the other hand, one can refine the definitions of \(\vartheta\) in Table 2 and in (11) by setting \(\vartheta = \min\{n - q, \lfloor \log_{1/p}(2)\rfloor\}\); we did not do so for simplicity of presentation.
6. Also, \(p\) can be arbitrarily close to 1. For \(k \in \mathbb{N}^*\), \(p = 2^{-\frac{1}{k}} \Rightarrow \vartheta = k\). Setting \(\varepsilon = \frac{3}{4}\) gives \(q = \lceil \log_{1/p}(4)\rceil = 2k\). Then, for \(n \ge 3k\), we consider a conjunction of size \(3k\) (the final sketch after these notes checks these values for a range of \(k\)).
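Note 2's case split lends itself to a small helper. The following is a minimal sketch in Python; \(\zeta\) is taken as an input, since its defining expression, (12), is not reproduced in this excerpt, and the tolerance-based integrality test stands in for exact arithmetic.

```python
from math import ceil, isclose

def q_m(zeta, tol=1e-12):
    """Rule from Note 2: q_m = zeta + 1 when zeta is an integer,
    and q_m = ceil(zeta) (< zeta + 1) otherwise."""
    nearest = round(zeta)
    if isclose(zeta, nearest, abs_tol=tol):  # zeta in Z, up to rounding
        return nearest + 1
    return ceil(zeta)

print(q_m(1.0), q_m(1.3))  # 2, 2  (the uniform case has zeta = 1)
```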
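The inequality chain in Note 3 can also be checked numerically. The sketch below, under the same assumptions (\(p = 2^{-1/k}\) with \(k \ge 3\), and \(\varepsilon\) in the stated range), confirms that \(q < k\) and that \(p^{k-1} > \frac{1}{2}\).

```python
from math import ceil, log

for k in (3, 4, 5):
    p = 2 ** (-1 / k)            # p in F with k >= 3, so p >= 2**(-1/3)
    eps_lo = 3 * p ** (k - 1)    # = 3/(2p); Note 3 needs eps_lo <= eps < 2
    eps = (eps_lo + 2) / 2       # any eps in that range will do
    q = ceil(log(3 / eps, 1 / p))
    assert q < k, (k, q)         # q < log_{1/p}(2) = k, as claimed
    assert p ** (k - 1) > 0.5    # hence U = p**u >= p**(k-1) = 1/(2p) > 1/2
    print(f"k={k}: eps in [{eps_lo:.3f}, 2) -> q={q} < k")
```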
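Finally, the values in Note 6 can be checked for a range of \(k\). The sketch assumes \(\vartheta = \lfloor \log_{1/p}(2)\rfloor\), consistent with the refinement mentioned in Note 5 (Table 2 itself is not reproduced here); small guards absorb floating-point rounding, since both logarithms land exactly on integers.

```python
from math import ceil, floor, log

EPS = 1e-9  # guard: log_{1/p}(2) and log_{1/p}(4) are exactly k and 2k

for k in range(1, 9):
    p = 2 ** (-1 / k)                   # p in F; p -> 1 as k grows
    theta = floor(log(2, 1 / p) + EPS)  # assumed: theta = floor(log_{1/p} 2)
    q = ceil(log(4, 1 / p) - EPS)       # eps = 3/4 => q = ceil(log_{1/p} 4)
    assert (theta, q) == (k, 2 * k), (k, theta, q)
    print(f"k={k}: p={p:.4f}, theta={theta}, q={q}; size q+theta = {3 * k}")
```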
References
Ajtai, M., Feldman, V., Hassidim, A., Nelson, J.: Sorting and selection with imprecise comparisons. ACM Trans. Algorithms 12(2), 19 (2016)
Angelino, E., Kanade, V.: Attribute-efficient evolvability of linear functions. In: ITCS, pp. 287–300 (2014)
Angluin, D., Laird, P.D.: Learning from noisy examples. Mach. Learn. 2(4), 343–370 (1987)
Aslam, J.A., Decatur, S.E.: Specification and simulation of statistical query algorithms for efficiency and noise tolerance. J. Comput. Syst. Sci. 56(2), 191–208 (1998)
Bshouty, N.H., Feldman, V.: On using extended statistical queries to avoid membership queries. J. Mach. Learn. Res. 2, 359–395 (2002)
Diochnos, D.I., Turán, G.: On evolvability: the swapping algorithm, product distributions, and covariance. In: SAGA, pp. 74–88 (2009)
Droste, S., Jansen, T., Wegener, I.: On the analysis of the (1+1) evolutionary algorithm. Theor. Comput. Sci. 276(1–2), 51–81 (2002)
Feldman, V.: Evolvability from learning algorithms. In: STOC, pp. 619–628 (2008)
Feldman, V.: Robustness of evolvability. In: COLT, pp. 277–292 (2009)
Feldman, V.: Distribution-independent evolvability of linear threshold functions. In: COLT, pp. 253–272 (2011)
Feldman, V.: A complete characterization of statistical query learning with applications to evolvability. J. Comput. Syst. Sci. 78(5), 1444–1459 (2012)
Goldman, S.A., Sloan, R.H.: Can PAC learning algorithms tolerate random attribute noise? Algorithmica 14(1), 70–84 (1995)
Kanade, V.: Evolution with recombination. In: FOCS, pp. 837–846 (2011)
Kanade, V., Valiant, L.G., Vaughan, J.W.: Evolution with drifting targets. In: COLT, pp. 155–167 (2010)
Kearns, M.J.: Efficient noise-tolerant learning from statistical queries. In: STOC, pp. 392–401 (1993)
Michael, L.: Evolvability via the Fourier transform. Theor. Comput. Sci. 462, 88–98 (2012)
Szörényi, B.: Characterizing statistical query learning: simplified notions and proofs. In: ALT, pp. 186–200 (2009)
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
Valiant, L.G.: Evolvability. In: Kučera, L., Kučera, A. (eds.) MFCS 2007. LNCS, vol. 4708, pp. 22–43. Springer, Heidelberg (2007). doi:10.1007/978-3-540-74456-6_5
Valiant, P.: Distribution free evolvability of polynomial functions over all convex loss functions. In: ITCS, pp. 142–148 (2012)
Valiant, P.: Evolvability of real functions. ACM Trans. Comput. Theor. 6(3), 12:1–12:19 (2014)
Wegener, I.: Theoretical aspects of evolutionary algorithms. In: Orejas, F., Spirakis, P.G., van Leeuwen, J. (eds.) ICALP 2001. LNCS, vol. 2076, pp. 64–78. Springer, Heidelberg (2001). doi:10.1007/3-540-48224-5_6
Acknowledgement
The author would like to thank György Turán for fruitful discussions on earlier versions of the paper, and Yanjun Qi and Elias Tsigaridas for additional interesting discussions.