Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations

Discrete & Computational Geometry

Abstract

Differential privacy is a definition giving a strong privacy guarantee even in the presence of auxiliary information. In this work, we pursue the application of geometric techniques for achieving differential privacy, a highly promising line of work initiated by Hardt and Talwar (Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC’10, pp. 705–714. ACM Press, New York, 2010). We apply these techniques to the problem of marginal release. Here, a database refers to a collection of the data of \(n\) individuals, each characterized by \(d\) binary attributes. A \(k\)-way marginal query is specified by a subset \(S\) of \(k\) attributes, together with a \(|S|\)-dimensional binary vector \(\beta \) specifying their values. The true answer to this query is the number of people in the database whose attribute vector restricted to \(S\) agrees with \(\beta \). Information theoretically, the error complexity of marginal queries, that is, how “wrong” the answers have to be in order to preserve differential privacy, is well understood: the per-query additive error is known to be at least \(\varOmega (\min \{ \sqrt{n},d^{k/2}\})\) and at most \(\tilde{O}(\sqrt{n}d^{{\lceil k/2\rceil /4}})\). However, no polynomial time algorithm with error complexity as low as the information-theoretic upper bound is known for small \(n\). We present a polynomial time algorithm that matches the best known information-theoretic bounds when \(k=2\); more generally, by reducing to the case \(k=2\), for any distribution on marginal queries, our algorithm achieves average error at most \(\tilde{O}(\sqrt{n}d^{{\lceil k/2\rceil /4}})\), an improvement over previous work when \(k\) is small and when error \(o(n)\) is desirable. Using private boosting, we are also able to give nearly matching worst-case error bounds. Our algorithms are based on the geometric techniques of Nikolov et al. (Proceedings of the 45th Annual ACM Symposium on Theory of Computing, STOC’13, pp. 351–360. ACM Press, New York, 2013), wherein a vector of “sufficiently noisy” answers is projected onto a particular convex body. We reduce the projection step, which is expensive, to a simple geometric question: given (a succinct representation of) a convex body \(K\), find a containing convex body \(L\) that one can efficiently optimize over, while keeping the Gaussian width of \(L\) small. This reduction is achieved by a careful use of the Frank–Wolfe algorithm.
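As a rough illustration of two ingredients named in the abstract, the sketch below counts a marginal query on a toy database and then approximately projects a vector onto a convex body using Frank–Wolfe iterations. It is a minimal sketch under stated assumptions, not the paper's algorithm: the names `marginal_count` and `frank_wolfe_project`, the toy data, and the choice of body (the symmetric convex hull of the standard basis vectors of \(\mathbb{R}^2\), given as an explicit vertex list) are all hypothetical choices made for illustration.

```python
import numpy as np

def marginal_count(db, S, beta):
    """Count rows of the 0/1 matrix `db` whose attributes indexed by S equal beta."""
    db = np.asarray(db)
    return int(np.sum(np.all(db[:, list(S)] == np.asarray(beta), axis=1)))

def frank_wolfe_project(y, vertices, steps=200):
    """Approximately minimize ||x - y||^2 over conv(vertices) with Frank-Wolfe."""
    y = np.asarray(y, dtype=float)
    V = np.asarray(vertices, dtype=float)
    x = V[0].copy()                         # start at an arbitrary vertex
    for t in range(steps):
        grad = 2.0 * (x - y)                # gradient of the squared distance
        s = V[np.argmin(V @ grad)]          # linear minimization over the vertex list
        gamma = 2.0 / (t + 2.0)             # standard Frank-Wolfe step size
        x = (1.0 - gamma) * x + gamma * s   # convex combination stays inside the hull
    return x

# Toy database (assumed data): n = 4 individuals, d = 3 binary attributes.
db = [[1, 0, 1],
      [1, 1, 1],
      [0, 0, 1],
      [1, 0, 0]]

# 2-way marginal: S = {0, 1}, beta = (1, 0). Rows 0 and 3 match, so count == 2.
count = marginal_count(db, S=(0, 1), beta=(1, 0))

# Symmetric convex hull of P = {e1, e2}: the convex hull of P and -P.
P = np.eye(2)
hull_vertices = np.vstack([P, -P])

# A point outside the body, standing in for a vector of noisy answers.
y = np.array([0.9, 0.9])
x_hat = frank_wolfe_project(y, hull_vertices)  # approaches (0.5, 0.5), the projection of y
```

Each iteration only needs a linear optimization over the body, which is why this style of method pairs naturally with a relaxation that one can efficiently optimize over. Here the linear step is a plain scan over four vertices.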

Notes

  1. Throughout this introduction, we will ignore the dependence on privacy parameters such as \(\varepsilon \) and \(\delta \), and use the \(\tilde{O}\) and \(\tilde{\varOmega }\) notations to hide factors logarithmic in \(d^k\).

  2. In fact the algorithm in [32] is exactly the Gaussian noise mechanism for \(\alpha < 2^{-k}\).

  3. This is a simple notational switch from the usual \(\{0,1\}\) vectors, which helps simplify notation.

  4. The symmetric convex hull of a set of points \(P\) is the convex hull of \(P\) and \(-P\).

  5. For example, there is an easy reduction from the maximum cut problem. We omit the details.

  6. The \(p\) to \(q\) norm of a matrix \(M\) is defined as \(\Vert M\Vert _{p\mapsto q} = \max _{x:\Vert x\Vert _p = 1}{\Vert Mx\Vert _q}\).
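The definition in note 6 can be checked numerically on very small matrices. The sketch below assumes the special case \(p=\infty\), \(q=1\) and uses the hypothetical helper name `inf_to_one_norm`. Because \(x \mapsto \Vert Mx\Vert _1\) is convex, its maximum over the cube \(\Vert x\Vert _\infty \le 1\) is attained at a sign vector, so enumerating \(\{-1,1\}^m\) suffices. The enumeration is exponential in the number of columns and is meant only as a toy check.

```python
import itertools
import numpy as np

def inf_to_one_norm(M):
    """Brute-force max of ||Mx||_1 over sign vectors x in {-1, +1}^m."""
    M = np.asarray(M, dtype=float)
    m = M.shape[1]
    best = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=m):
        value = np.abs(M @ np.array(signs)).sum()   # ||Mx||_1 for this sign vector
        best = max(best, value)
    return float(best)

# Every sign vector gives ||Mx||_1 = 2 for this matrix, so the norm is 2.0.
value = inf_to_one_norm([[1, 1], [1, -1]])
```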

References

  1. Alon, N., Naor, A.: Approximating the cut-norm via Grothendieck’s inequality. In: ACM Symposium on Theory of Computing, pp. 72–80. ACM Press, New York (2004)

  2. Barak, B., Chaudhuri, K., Dwork, C., Kale, S., McSherry, F., Talwar, K.: Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In: Libkin, L. (ed.) Proceedings of ACM PODS, pp. 273–282. ACM Press, New York (2007)

  3. Blum, A., Ligett, K., Roth, A.: A learning theory approach to non-interactive database privacy. In: STOC ’08: Proceedings of the 40th Annual ACM Symposium on Theory of Computing, pp. 609–618. ACM Press, New York (2008)

  4. Bun, M., Ullman, J., Vadhan, S.: Fingerprinting codes and the price of approximate differential privacy. arXiv preprint http://arxiv.org/abs/1311.3158 (2013)

  5. Buss, S.R., Grigoriev, D., Impagliazzo, R., Pitassi, T.: Linear gaps between degrees for the polynomial calculus modulo distinct primes. J. Comput. Syst. Sci. 62(2), 267–289 (2001)

  6. Chandrasekaran, K., Thaler, J., Ullman, J., Wan, A.: Faster private release of marginals on small databases. CoRR http://arxiv.org/abs/1304.3754 (2013)

  7. Cheraghchi, M., Klivans, A., Kothari, P., Lee, H.K.: Submodular functions are noise stable. In: SODA ’12: Proceedings of the Twenty-Third Annual ACM–SIAM Symposium on Discrete Algorithms, pp. 1586–1592 (2012)

  8. Clarkson, K.: Coresets, sparse greedy approximation, and the Frank–Wolfe algorithm. ACM Trans. Algorithms (TALG) 6(4), 63 (2010)

  9. Dasgupta, S., Gupta, A.: An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct. Algorithms 22, 60–65 (2003)

  10. Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proceedings of 22nd ACM Symposium on Principles of Database Systems, pp. 202–210. ACM Press, New York (2003)

  11. Dubhashi, D.P., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, New York (2009)

  12. Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Advances in Cryptology, CRYPTO’04. Lecture Notes in Computer Science, vol. 3152, pp. 528–544. Springer, Berlin (2004)

  13. Dwork, C., Roth, A.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2013)

  14. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) Theory of Cryptography. Lecture Notes in Computer Science, vol. 3876, pp. 265–284. Springer, Berlin, Heidelberg (2006)

  15. Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. In: Vaudenay, S. (ed.) EUROCRYPT. Lecture Notes in Computer Science, vol. 4004, pp. 486–503. Springer, Heidelberg (2006)

  16. Dwork, C., Naor, M., Reingold, O., Rothblum, G.N., Vadhan, S.: On the complexity of differentially private data release: efficient algorithms and hardness results. In: Proceedings of the 41st ACM Symposium on Theory of Computing, pp. 381–390. ACM Press, New York (2009)

  17. Dwork, C., Rothblum, G.N., Vadhan, S.: Boosting and differential privacy. In: 2010 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 51–60. IEEE, Las Vegas (2010)

  18. Frank, M., Wolfe, P.: An algorithm for quadratic programming. Nav. Res. Logist. Q. 3(1–2), 95–110 (1956)

  19. Grigoriev, D.: Linear lower bound on degrees of positivstellensatz calculus proofs for the parity. Theor. Comput. Sci. 259(1–2), 613–622 (2001)

  20. Grothendieck, A.: Résumé de la théorie métrique des produits tensoriels topologiques. Bol. Soc. Mat. São Paulo 8, 1–79 (1953)

  21. Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences in combinatorial optimization. Combinatorica 1(2), 169–197 (1981)

  22. Gupta, A., Hardt, M., Roth, A., Ullman, J.: Privately releasing conjunctions and the statistical query barrier. In: STOC, pp. 803–812. ACM Press, New York (2011)

  23. Hardt, M., Rothblum, G.: A multiplicative weights mechanism for privacy-preserving data analysis. In: Proceedings of the 51st Foundations of Computer Science (FOCS). IEEE, Las Vegas (2010)

  24. Hardt, M., Talwar, K.: On the geometry of differential privacy. In: Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC ’10, pp. 705–714. ACM Press, New York (2010)

  25. Hardt, M., Rothblum, G.N., Servedio, R.A.: Private data release via learning thresholds. In: Proceedings of the Twenty-Third Annual ACM–SIAM Symposium on Discrete Algorithms, SODA’12, pp. 168–187. SIAM, Kyoto (2012). http://dl.acm.org/citation.cfm?id=2095116.2095131

  26. Hardt, M., Ligett, K., McSherry, F.: A simple and practical algorithm for differentially private data release. In: NIPS, pp. 2348–2356 (2012)

  27. Kasiviswanathan, S., Rudelson, M., Smith, A., Ullman, J.: The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In: Proceedings of the 42nd ACM Symposium on Theory of Computing, pp. 775–784. ACM Press, New York (2010)

  28. Lindenstrauss, J., Pełczyński, A.: Absolutely summing operators in \(\mathcal{L}_{p}\)-spaces and their applications. Stud. Math. 29(3), 275–326 (1968)

  29. Nikolov, A., Talwar, K., Zhang, L.: The geometry of differential privacy: the sparse and approximate cases. In: Proceedings of the 45th Annual ACM Symposium on Theory of Computing, STOC ’13, pp. 351–360. ACM Press, New York (2013)

  30. O’Donnell, R., Zhou, Y.: Approximability and proof complexity. In: SODA, pp. 1537–1556 (2013)

  31. Roth, A., Roughgarden, T.: Interactive privacy via the median mechanism. In: Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC ’10, pp. 765–774. ACM Press, New York (2010)

  32. Thaler, J., Ullman, J., Vadhan, S.P.: Faster algorithms for privately releasing marginals. ICALP 1, 810–821 (2012)

  33. Ullman, J., Vadhan, S.: PCPs and the hardness of generating private synthetic data. In: Proceedings of the 8th Conference on Theory of Cryptography, TCC 2011, Providence, RI (2011)

Acknowledgments

We would like to thank Moritz Hardt and Anupam Gupta for several helpful discussions.

Author information

Corresponding authors

Correspondence to Cynthia Dwork or Aleksandar Nikolov.

Additional information

Editors-in-Charge: Siu-Wing Cheng and Olivier Devillers

Preliminary version in Proceedings of the 30th Annual Symposium on Computational Geometry, 2014.

About this article

Cite this article

Dwork, C., Nikolov, A. & Talwar, K. Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations. Discrete Comput Geom 53, 650–673 (2015). https://doi.org/10.1007/s00454-015-9678-x

  • DOI: https://doi.org/10.1007/s00454-015-9678-x
