A new central limit theorem and decomposition for Gaussian polynomials, with an application to deterministic approximate counting

Probability Theory and Related Fields

Abstract

One of the main results of this paper is a new multidimensional central limit theorem (CLT) for multivariate polynomials under Gaussian inputs. Roughly speaking, the new CLT shows that any collection of Gaussian polynomials with small eigenvalues (suitably defined) must have a joint distribution which is close to a multidimensional Gaussian distribution. The new CLT is proved using tools from Malliavin calculus and Stein’s method. A second main result of the paper, which complements the new CLT, is a new decomposition theorem for low-degree multilinear polynomials over Gaussian inputs. Roughly speaking, this result shows that (up to some small error) any such polynomial is very close to a polynomial which can be decomposed into a bounded number of multilinear polynomials all of which have extremely small eigenvalues. An important feature of this decomposition theorem is the delicate control obtained between the number of polynomials in the decomposition versus their eigenvalues. As the main application of these results, we give a deterministic algorithm for approximately counting satisfying assignments of a degree-d polynomial threshold function (PTF) over the domain \(\{-1,1\}^n\); this is a well-studied problem from theoretical computer science. More precisely, given as input a degree-d polynomial \(p(x_1,\dots ,x_n)\) over \({{\mathbb {R}}}^n\) and a parameter \(\epsilon > 0\), the algorithm approximates

$$\begin{aligned} \mathop {\mathbf{Pr}}_{x \sim \{-1,1\}^n}[p(x) \ge 0] \end{aligned}$$

to within an additive \(\pm \epsilon \) in time \(O_{d,\epsilon }(1)\cdot \mathrm {poly}(n^d)\). (Since it is NP-hard to determine whether the above probability is nonzero, any sort of efficient multiplicative approximation is almost certainly impossible even for randomized algorithms.) Note that the running time of the algorithm (as a function of \(n^d\), the number of coefficients of a degree-d PTF) is a fixed polynomial. The fastest previous algorithm for this problem (Kane, CoRR, arXiv:1210.1280, 2012), based on constructions of unconditional pseudorandom generators for degree-d PTFs, runs in time \(n^{O_{d,c}(1) \cdot \epsilon ^{-c}}\) for all \(c > 0\).
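To make the quantity being approximated concrete, here is a minimal brute-force sketch. It is not the algorithm of this paper: it enumerates all \(2^n\) points of \(\{-1,1\}^n\) and therefore runs in exponential time. The function name `ptf_acceptance_probability` and the example polynomial are illustrative assumptions.

```python
import itertools
from fractions import Fraction

def ptf_acceptance_probability(p, n):
    """Exact Pr_{x ~ {-1,1}^n}[p(x) >= 0], by enumerating all 2^n points."""
    hits = sum(1 for x in itertools.product((-1, 1), repeat=n) if p(x) >= 0)
    return Fraction(hits, 2 ** n)

# Hypothetical degree-2 example: p(x) = x1*x2 + x2*x3 + x1*x3 - 1/2.
def example_p(x):
    return x[0] * x[1] + x[1] * x[2] + x[0] * x[2] - 0.5

prob = ptf_acceptance_probability(example_p, 3)
```

In this example the pairwise products sum to 3 when all three coordinates agree and to -1 otherwise, so only 2 of the 8 points satisfy \(p(x) \ge 0\).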

Notes

  1. (Note that since we are only dealing with symmetric tensors we could equivalently have considered only partitions into \([1,\dots ,k]\), \([k+1,\dots ,p]\) where \(1 \le k \le p-1.\))

  2. The reason this problem did not arise in the degree-2 polynomial decompositions of [14] is that each polynomial \(P_i,Q_i\) obtained from Decompose-One-Wiener in that setting must have degree 1 (the only way to break the number 2 into a sum of positive integers is as 1+1). Degree-1 polynomials may be viewed as having “perfect eigenregularity” (note that any degree-1 polynomial in Gaussian variables is itself distributed precisely as a Gaussian), so having any number of such degree-1 polynomials did not pose a problem in [14].

  3. BPP is the class of randomized polynomial-time algorithms which err on any given input with probability (say) at most 1/3. It is not known whether NP \(\subseteq \) BPP, but this is viewed as very unlikely; see e.g. [1].

  4. A similar “multilinearization” procedure is analyzed in [31], but since the setting and required guarantees are somewhat different here we give a self-contained algorithm and analysis.

References

  1. Aaronson, S.: P \(\stackrel{?}{=}\) NP. http://www.scottaaronson.com/papers/pnp.pdf. Earlier version in “Open Problems in Mathematics”, Nash, J.F., Jr., Rassias, M.Th. (eds.) (2017)

  2. Alon, N., Babai, L., Itai, A.: A fast and simple randomized algorithm for the maximal independent set problem. J. Algorithms 7, 567–583 (1985)

  3. Aziz, H., Paterson, M., Leech, D.: Efficient algorithm for designing weighted voting games. In: IEEE International Multitopic Conference, pp. 1–6 (2007)

  4. Ajtai, M., Wigderson, A.: Deterministic simulation of probabilistic constant depth circuits. In: Proceedings of 26th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 11–19 (1985)

  5. Adamczak, R., Wolff, P.: Concentration inequalities for non-Lipschitz functions with bounded derivatives of higher order. Probab. Theory Relat. Fields 162(3), 531–586 (2015)

  6. Bhatia, R.: Matrix Analysis. Springer, Basel (2000)

  7. Breuer, P., Major, P.: Central limit theorems for non-linear functionals of Gaussian fields. J. Multivar. Anal. 13(3), 425–441 (1983)

  8. Chatterjee, S.: A new method of normal approximation. Ann. Probab. 36(4), 1584–1610 (2008)

  9. Chatterjee, S.: Fluctuations of eigenvalues and second-order Poincaré inequalities. Probab. Theory Relat. Fields 143, 1–40 (2009)

  10. Cartwright, D., Sturmfels, B.: The number of eigenvalues of a tensor. Linear Algebra Appl. 432(2), 942–952 (2013)

  11. Carbery, A., Wright, J.: Distributional and \(L^q\) norm inequalities for polynomials over convex bodies in \(R^n\). Math. Res. Lett. 8(3), 233–248 (2001)

  12. De, A., Diakonikolas, I., Feldman, V., Servedio, R.: Near-optimal solutions for the Chow parameters problem and low-weight approximation of halfspaces. In: Proceedings of 44th ACM Symposium on Theory of Computing (STOC), pp. 729–746 (2012)

  13. De, A., Diakonikolas, I., Servedio, R.A.: The inverse Shapley value problem. ICALP 1, 266–277 (2012)

  14. De, A., Diakonikolas, I., Servedio, R.: Deterministic approximate counting for degree-2 polynomial threshold functions. Manuscript (2013)

  15. De, A., Diakonikolas, I., Servedio, R.: Deterministic approximate counting for juntas of degree-2 polynomial threshold functions. Manuscript (2013)

  16. Diakonikolas, I., Harsha, P., Klivans, A., Meka, R., Raghavendra, P., Servedio, R.A., Tan, L.-Y.: Bounding the average sensitivity and noise sensitivity of polynomial threshold functions. In: STOC, pp. 533–542 (2010)

  17. Diakonikolas, I., Kane, D.M., Nelson, J.: Bounded independence fools degree-2 threshold functions. In: Proceedings of 51st IEEE Symposium on Foundations of Computer Science (FOCS), pp. 11–20 (2010)

  18. Dowson, D.C., Landau, B.V.: The Frechet distance between multivariate normal distributions. J. Multivar. Anal. 12(3), 450–455 (1982)

  19. Diakonikolas, I., O’Donnell, R., Servedio, R., Wu, Y.: Hardness results for agnostically learning low-degree polynomial threshold functions. In: SODA, pp. 1590–1606 (2011)

  20. Diakonikolas, I., Servedio, R., Tan, L.-Y., Wan, A.: A regularity lemma, and low-weight approximators, for low-degree polynomial threshold functions. In: CCC, pp. 211–222 (2010)

  21. Feller, W.: An Introduction to Probability Theory and Its Applications. Wiley, London (1968)

  22. Friedman, J., Wigderson, A.: On the second eigenvalue of hypergraphs. Combinatorica 15(1), 43–65 (1995)

  23. Goldmann, M., Håstad, J., Razborov, A.: Majority gates vs. general weighted threshold gates. Comput. Complex. 2, 277–300 (1992)

  24. Gopalan, P., Klivans, A., Meka, R., Stefankovic, D., Vempala, S., Vigoda, E.: An fptas for #knapsack and related counting problems. In: FOCS, pp. 817–826 (2011)

  25. Golub, G., Van Loan, C.F.: Matrix Computations. The Johns Hopkins University Press, Baltimore (1996)

  26. Gopalan, P., Meka, R., Reingold, O.: DNF sparsification and a faster deterministic counting algorithm. Comput. Complex. 22(2), 275–310 (2013)

  27. Gopalan, P., O’Donnell, R., Wu, Y., Zuckerman, D.: Fooling functions of halfspaces under product distributions. In: IEEE Conference on Computational Complexity (CCC), pp. 223–234 (2010)

  28. Håstad, J.: On the size of weights for threshold gates. SIAM J. Discret. Math. 7(3), 484–492 (1994)

  29. Janson, S.: Gaussian Hilbert Spaces. Cambridge University Press, Cambridge (1997)

  30. Kane, D.M.: The Gaussian surface area and noise sensitivity of degree-d polynomial threshold functions. In: CCC, pp. 205–210 (2010)

  31. Kane, D.M.: k-independent gaussians fool polynomial threshold functions. In: IEEE Conference on Computational Complexity, pp. 252–261 (2011)

  32. Kane, D.M.: A small PRG for polynomial threshold functions of gaussians. In: FOCS, pp. 257–266 (2011)

  33. Kane, D.M.: The correct exponent for the Gotsman–Linial conjecture. arXiv:1210.1283 (2012)

  34. Kane, D.M.: A pseudorandom generator for polynomial threshold functions of gaussian with subpolynomial seed length. arXiv:1210.1280 (2012)

  35. Kane, D.M.: A structure theorem for poorly anticoncentrated gaussian chaoses and applications to the study of polynomial threshold functions. In: FOCS, pp. 91–100 (2012)

  36. Kalai, A., Klivans, A., Mansour, Y., Servedio, R.: Agnostically learning halfspaces. SIAM J. Comput. 37(6), 1777–1805 (2008)

  37. Karnin, Z.S., Rabani, Y., Shpilka, A.: Explicit dimension reduction and its applications. SIAM J. Comput. 41(1), 219–249 (2012)

  38. Latala, R.: Estimates of moments and tails of gaussian chaoses. Ann. Probab. 34(6), 2315–2331 (2006)

  39. Latala, R.: Personal communication (2013)

  40. Ledoux, M.: Personal communication (2013)

  41. Luby, M., Velickovic, B.: On deterministic approximation of DNF. Algorithmica 16(4/5), 415–433 (1996)

  42. Luby, M., Velickovic, B., Wigderson, A.: Deterministic approximate counting of depth-2 circuits. In: Proceedings of the 2nd ISTCS, pp. 18–24 (1993)

  43. Myhill, J., Kautz, W.: On the size of weights required for linear-input switching functions. IRE Trans. Electron. Comput. EC10(2), 288–290 (1961)

  44. Mossel, E., O’Donnell, R., Oleszkiewicz, K.K.: Noise stability of functions with low influences: invariance and optimality. Ann. Math. 171, 295–341 (2010)

  45. Minsky, M., Papert, S.: Perceptrons: An Introduction to Computational Geometry. MIT Press, Cambridge (1968)

  46. Muroga, S., Toda, I., Takasu, S.: Theory of majority switching elements. J. Frankl. Inst. 271, 376–418 (1961)

  47. Muroga, S.: Threshold Logic and Its Applications. Wiley-Interscience, New York (1971)

  48. Meka, R., Zuckerman, D.: Pseudorandom generators for polynomial threshold functions. http://arxiv.org/abs/0910.4122 (2009)

  49. Meka, R., Zuckerman, D.: Pseudorandom generators for polynomial threshold functions. In: STOC, pp. 427–436 (2010)

  50. Naor, J., Naor, M.: Small-bias probability spaces: efficient constructions and applications. SIAM J. Comput. 22(4), 838–856 (1993). (Earlier version in STOC’90)

  51. Nourdin, I.: Lectures on gaussian approximations with Malliavin calculus. Technical report. http://arxiv.org/abs/1203.4147v3, 28 June 2012

  52. Nourdin, I.: Personal communication (2013)

  53. Nualart, D., Peccati, G.: Central limit theorems for sequences of multiple stochastic integrals. Ann. Probab. 33(1), 177–193 (2005)

  54. Nourdin, I., Peccati, G.: Stein’s method meets Malliavin calculus: a short survey with new estimates. Technical report. http://arxiv.org/abs/0906.4419v2, 17 Sep 2009

  55. Nourdin, I., Peccati, G., Réveillac, A.: Multivariate normal approximation using Stein’s method and Malliavin calculus. Ann. Inst. H. Poincaré Probab. Stat. 46(1), 45–58 (2010)

  56. Oleszkiewicz, K.: Personal communication (2013)

  57. Orponen, P.: Neural networks and complexity theory. In: Proceedings of the 17th International Symposium on Mathematical Foundations of Computer Science, pp. 50–61 (1992)

  58. Podolskii, V.V.: Perceptrons of large weight. Probl. Inf. Transm. 45(1), 46–53 (2009)

  59. Servedio, R.: Every linear threshold function has a low-weight approximator. Comput. Complex. 16(2), 180–209 (2007)

  60. Sherstov, A.A.: Halfspace matrices. Comput. Complex. 17(2), 149–178 (2008)

  61. Sherstov, A.: The intersection of two halfspaces has high threshold degree. In: Proceedings of 50th IEEE Symposium on Foundations of Computer Science (FOCS) (2009)

  62. Shalev-Shwartz, S., Shamir, O., Sridharan, K.: Learning kernel-based halfspaces with the 0–1 loss. SIAM J. Comput. 40(6), 1623–1646 (2011)

  63. Trevisan, L.: A note on approximate counting for \(k\)-DNF. In: Proceedings of the Eighth International Workshop on Randomization and Computation, pp. 417–426 (2004)

  64. Viola, E.: The sum of \(d\) small-bias generators fools polynomials of degree \(d\). Comput. Complex. 18(2), 209–217 (2009)

Acknowledgements

We thank Ilias Diakonikolas for his contributions in the early stages of this project. We also thank Rafal Latala, Michel Ledoux, Elchanan Mossel, Ivan Nourdin and Krzysztof Oleszkiewicz for answering questions about the CLT. Part of this work was done when A.D. was hosted by Oded Regev and the Simons Institute. A.D. would like to thank them for their kind hospitality and support.

Author information

Corresponding author

Correspondence to Anindya De.

Additional information

Anindya De: Work was partly done while the author was hosted by Oded Regev at NYU and partly while the author was a fellow at the Simons Institute, Berkeley. Partly supported by NSF Grant CCF-1320188. Rocco A. Servedio: Supported by NSF Grants CCF-1115703 and CCF-1319788.

Appendix A: Dealing with non-multilinear polynomials

The decomposition procedure that we use relies heavily on the fact that the input polynomials \(p_i\) are multilinear. To handle general (non-multilinear) degree-d polynomials, the first step of our algorithm is to transform them to (essentially) equivalent multilinear degree-d polynomials. This is accomplished by a simple procedure whose performance is described in the following theorem (see Note 4). Given Theorem 55, we may assume that the polynomial p given as input in Theorem 3 is multilinear.

Theorem 55

There is a deterministic procedure Multilinearize with the following properties: The algorithm takes as input a (not necessarily multilinear) variance-1 degree-d polynomial p over \({{\mathbb {R}}}^n\) and an accuracy parameter \(\delta > 0\). It runs in time \(O_{d,\delta }(1)\cdot \mathrm {poly}(n^d)\) and outputs a multilinear degree-d polynomial q over \({{\mathbb {R}}}^{n'}\), with \(n' \le O_{d,\delta }(1) \cdot n\), such that

$$\begin{aligned} \left| \mathbf{Pr}_{x \sim N(0,1)^n}[p(x) \ge 0] - \mathbf{Pr}_{x \sim N(0,1)^{n'}}[q(x) \ge 0] \right| \le O(\delta ). \end{aligned}$$

Proof

The procedure Multilinearize is given below. (Recall that a diagonal entry of a q-tensor \(f = \sum _{i_1,\dots ,i_q=1} f(i_1, \dots , i_q)\cdot e_{i_1} \otimes \cdots \otimes e_{i_q}\) is a coefficient \(f(i_1,\dots ,i_q)\) that has \(i_a=i_b\) for some \(a \ne b.\))

[Figure a: the procedure Multilinearize]

It is clear that all the tensors \(g_j\) are multilinear, so by Remark 11 the polynomial q that the procedure Multilinearize outputs is multilinear. The main step in proving Theorem 55 is to bound the variance of \(\tilde{q}-q\), so we will establish the following claim:

Claim 56

\(\mathbf{Var}[\tilde{q}-q] \le {\frac{d^2}{K}} \cdot \mathbf{Var}[\tilde{q}].\)

Proof

We first observe that

$$\begin{aligned} \mathbf{Var}[\tilde{q}-q] = \mathbf{Var}\left[ \sum _{j=1}^d (I_j(\tilde{f}_j) - I_j(g_j)) \right] = \sum _{j=1}^d \mathbf{E}\left[ (I_j(\tilde{f}_j) - I_j(g_j))^2\right] , \end{aligned}$$
(54)

where the first equality is because \(g_0=f_0\) and the second is by Claim 12 and the fact that each \(I_j(\tilde{f}_j), I_j(g_j)\) has mean 0 for \(j \ge 1.\) Now fix a value \(1 \le j \le d.\) Since each \(g_j\) is obtained from \(\tilde{f}_j\) by zeroing out diagonal elements, again using Claim 12 we see that

$$\begin{aligned} \mathbf{E}\left[ (I_j(\tilde{f}_j) - I_j(g_j))^2\right] = j! \cdot \Vert \tilde{f}_j - g_j\Vert ^2_F, \end{aligned}$$
(55)

where the squared Frobenius norm \(\Vert \tilde{f}_j - g_j\Vert ^2_F\) equals the sum of squared entries of the tensor \(\tilde{f}_j - g_j\). Now, observe that the entry \(\alpha _{i_1,\dots ,i_j}=f_j(i_1, \ldots , i_j)\) of the tensor \(f_j \) maps to the entry

$$\begin{aligned}&\alpha _{i_1, \ldots , i_j} \frac{(e_{i_1,1} + \cdots + e_{i_1, K})}{\sqrt{K}} \otimes \ldots \otimes \frac{(e_{i_j,1} + \cdots +e_{i_j, K})}{\sqrt{K}}\\&\quad = \alpha _{i_1, \ldots , i_j} \cdot \sum _{(\ell _1, \ldots , \ell _{{j}}) \in [K]^{{j}}} \frac{1}{K^{{j}/2}} \otimes _{{{a=1}}}^{{j}} e_{i_{{a}},\ell _{{a}}} \end{aligned}$$

when \(\tilde{q}\) is constructed from p. Further observe that all \(K^j\) outcomes of \(\otimes _{{{a=1}}}^{{j}} e_{i_{{a}},\ell _{{a}}}\) are distinct. Since \(g_j\) is obtained by zeroing out the diagonal entries of \(\tilde{f}_j\), we get that

$$\begin{aligned} \Vert \tilde{f}_j - g_j \Vert _F^2 \le \sum _{{(i_1, \ldots , i_j) \in [n]^j}} (\alpha _{i_1, \ldots , i_j})^2 \cdot \frac{1}{K^j} \cdot |{\mathcal {S}}_{K, j}| \end{aligned}$$

where the set \({\mathcal {S}}_{K,j} = \{{(\ell _1, \ldots , \ell _j) \in [K]^j : \ell _1, \ldots , \ell _j} \text { are not all distinct}\}\). It is easy to see that \(|{\mathcal {S}}_{K, j}| \le (j^2 \cdot K^j)/{K}\), so we get

$$\begin{aligned} \Vert \tilde{f}_j - g_j \Vert _F^2 \le \sum _{{(i_1, \ldots , i_j) \in [n]^j}} (\alpha _{i_1, \ldots , i_j})^2 \cdot \frac{j^2}{{K}}. \end{aligned}$$
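The counting bound \(|{\mathcal {S}}_{K, j}| \le (j^2 \cdot K^j)/K\) used above can be checked by direct enumeration for small parameters. The sketch below is only an illustration with hypothetical helper names. It compares a brute-force count against the closed form \(K^j - K(K-1)\cdots (K-j+1)\) and against the stated bound.

```python
import itertools
import math

def count_not_all_distinct(K, j):
    """|S_{K,j}|: tuples in [K]^j whose coordinates are not all distinct."""
    return sum(1 for t in itertools.product(range(K), repeat=j)
               if len(set(t)) < j)

def closed_form(K, j):
    # K^j minus the number of tuples with pairwise distinct coordinates.
    return K ** j - math.perm(K, j)

def bound_holds(max_K, max_j):
    """Check |S_{K,j}| <= j^2 * K^j / K for all 1 <= K <= max_K, 1 <= j <= max_j."""
    for K in range(1, max_K + 1):
        for j in range(1, max_j + 1):
            size = count_not_all_distinct(K, j)
            if size != closed_form(K, j) or size * K > j ** 2 * K ** j:
                return False
    return True
```

The bound itself follows from a union bound over the at most \(j^2\) pairs \(a \ne b\) with \(\ell _a = \ell _b\), each of which leaves at most \(K^{j-1}\) free choices.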

Returning to (54) and (55), this yields

$$\begin{aligned} \mathbf{Var}[q-\tilde{q}] \le \sum _{j=1}^d j! \cdot \sum _{{(i_1, \ldots , i_j) \in [n]^j}} (\alpha _{i_1, \ldots , i_j})^2 \cdot \frac{j^2}{{K}} \le \frac{d^2}{K} \cdot \left( \sum _{j=1}^d j! \cdot \sum _{{(i_1, \ldots , i_j) \in [n]^j}} (\alpha _{i_1, \ldots , i_j})^2\right) . \end{aligned}$$

Using Fact 13 and Claim 12, we see that

$$\begin{aligned} \mathbf{Var}[p] = \sum _{j=1}^d \mathbf{Var}[I_j(f_j)] = \sum _{j=1}^d \mathbf{E}[I_j(f_j)^2] = \sum _{j=1}^d j! \cdot \sum _{{(i_1, \ldots , i_j) \in [n]^j}} (\alpha _{i_1, \ldots , i_j})^2. \end{aligned}$$

It is easy to see that \(\mathbf{Var}[\tilde{q}]=\mathbf{Var}[p]\), which equals 1 by assumption, so we have that \(\mathbf{Var}[q-\tilde{q}]\le {\frac{d^2}{K}} \cdot \mathbf{Var}[\tilde{q}]\) as desired. \(\square \)
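As a sanity check of Claim 56, the following sketch builds the tensors \(\tilde{f}_j\) and \(g_j\) explicitly for small random symmetric tensors and compares both sides of the bound. It relies on the identity \(\mathbf{E}[I_j(f)^2] = j! \cdot \Vert f\Vert _F^2\) used in the proof and on the substitution displayed above. The random inputs, the parameter values and all helper names are assumptions made for illustration only.

```python
import itertools
import math
import numpy as np

def symmetrize(t):
    """Average a tensor over all permutations of its axes."""
    perms = list(itertools.permutations(range(t.ndim)))
    return sum(np.transpose(t, p) for p in perms) / len(perms)

def spread(f, K):
    """Tensor of q~: entry ((i_1,l_1),...,(i_j,l_j)) is f(i_1,...,i_j) / K^(j/2)."""
    j = f.ndim
    return np.kron(f, np.ones((K,) * j)) / K ** (j / 2)

def zero_diagonal(t):
    """Zero every entry that has two equal indices (a diagonal entry)."""
    idx = np.indices(t.shape)
    diag = np.zeros(t.shape, dtype=bool)
    for a, b in itertools.combinations(range(t.ndim), 2):
        diag |= idx[a] == idx[b]
    return np.where(diag, 0.0, t)

def variances(tensors, K):
    """Return (Var[q~ - q], Var[q~]) via E[I_j(f)^2] = j! * ||f||_F^2."""
    gap = total = 0.0
    for f in tensors:
        w = math.factorial(f.ndim)
        ft = spread(f, K)
        gap += w * np.sum((ft - zero_diagonal(ft)) ** 2)
        total += w * np.sum(ft ** 2)
    return gap, total

rng = np.random.default_rng(0)
n, d, K = 2, 3, 10  # small illustrative sizes
tensors = [symmetrize(rng.standard_normal((n,) * j)) for j in range(1, d + 1)]
gap, total = variances(tensors, K)
assert gap <= d ** 2 / K * total
```

The quantity `total` also equals \(\sum _j j! \cdot \Vert f_j\Vert _F^2\) computed from the original tensors, matching the observation that \(\mathbf{Var}[\tilde{q}]=\mathbf{Var}[p]\).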

To finish the proof of Theorem 55, observe that by our choice of K we have \(\mathbf{Var}[q-\tilde{q}] \le (\delta /d)^{3d} \cdot \mathbf{Var}[ \tilde{q}]\). Since \(q-\tilde{q}\) has mean 0 and \(\mathbf{Var}[\tilde{q}]=1\) we may apply Lemma 6, and we get that \(|\mathbf{Pr}_{x \sim N(0,1)^{n'}}[q(x) \ge 0] - \mathbf{Pr}_{x \sim N(0,1)^{n'}}[\tilde{q}(x) \ge 0]| \le O(\delta ).\) The theorem follows by observing that the two distributions \(p(x)_{x \sim N(0,1)^n}\) and \(\tilde{q}(x)_{x \sim N(0,1)^{n'}}\) are identical. \(\square \)
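The final observation, that \(p(x)\) and \(\tilde{q}(x)\) are identically distributed, reduces to a linear-algebra fact: replacing each \(x_i\) by \((x_{i,1} + \cdots + x_{i,K})/\sqrt{K}\) is a linear map with orthonormal rows, so it sends a standard Gaussian on \({{\mathbb {R}}}^{nK}\) to a standard Gaussian on \({{\mathbb {R}}}^n\). A minimal sketch follows, assuming \(n' = nK\) and using a hypothetical example polynomial.

```python
import numpy as np

def substitution_matrix(n, K):
    """M with (M y)_i = (y_{i,1} + ... + y_{i,K}) / sqrt(K)."""
    return np.kron(np.eye(n), np.ones((1, K))) / np.sqrt(K)

n, K = 3, 5
M = substitution_matrix(n, K)

# Orthonormal rows: if y ~ N(0, I_{nK}) then M y has covariance M M^T = I_n,
# so M y ~ N(0,1)^n and p(M y) is distributed exactly as p(x).
assert np.allclose(M @ M.T, np.eye(n))

def p(x):
    # Hypothetical non-multilinear degree-3 polynomial.
    return x[0] ** 2 * x[1] - x[2]

def q_tilde(y):
    """The polynomial obtained from p by the substitution x = M y."""
    return p(M @ y)
```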

Cite this article

De, A., Servedio, R.A. A new central limit theorem and decomposition for Gaussian polynomials, with an application to deterministic approximate counting. Probab. Theory Relat. Fields 171, 981–1044 (2018). https://doi.org/10.1007/s00440-017-0804-y
