Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations

Dwork, Cynthia; Nikolov, Aleksandar; Talwar, Kunal

doi:10.1007/s00454-015-9678-x

Published: 24 April 2015

Discrete & Computational Geometry, volume 53, pages 650–673 (2015)

Cynthia Dwork¹,
Aleksandar Nikolov² &
Kunal Talwar³

Abstract

Differential privacy is a definition giving a strong privacy guarantee even in the presence of auxiliary information. In this work, we pursue the application of geometric techniques for achieving differential privacy, a highly promising line of work initiated by Hardt and Talwar (Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC’10, pp. 705–714. ACM Press, New York, 2010). We apply these techniques to the problem of marginal release. Here, a database refers to a collection of the data of \(n\) individuals, each characterized by \(d\) binary attributes. A \(k\)-way marginal query is specified by a subset \(S\) of \(k\) attributes, together with a \(|S|\)-dimensional binary vector \(\beta \) specifying their values. The true answer to this query is a count of the number of people in the database whose attribute vector restricted to \(S\) agrees with \(\beta \). Information theoretically, the error complexity of marginal queries (how “wrong” the answers have to be in order to preserve differential privacy) is well understood: the per-query additive error is known to be at least \(\varOmega (\min \{ \sqrt{n},d^{k/2}\})\) and at most \(\tilde{O}(\sqrt{n}d^{\lceil k/2\rceil /4})\). However, no polynomial time algorithm with error complexity as low as the information-theoretic upper bound is known for small \(n\). We present a polynomial time algorithm that matches the best known information-theoretic bounds when \(k=2\); more generally, by reducing to the case \(k=2\), for any distribution on marginal queries, our algorithm achieves average error at most \(\tilde{O}(\sqrt{n}d^{\lceil k/2\rceil /4})\), an improvement over previous work when \(k\) is small and when error \(o(n)\) is desirable. Using private boosting, we are also able to give nearly matching worst-case error bounds. Our algorithms are based on the geometric techniques of Nikolov et al. (Proceedings of the 45th Annual ACM Symposium on Theory of Computing, STOC’13, pp. 351–360. ACM Press, New York, 2013), wherein a vector of “sufficiently noisy” answers is projected onto a particular convex body. We reduce the projection step, which is expensive, to a simple geometric question: given (a succinct representation of) a convex body \(K\), find a containing convex body \(L\) that one can efficiently optimize over, while keeping the Gaussian width of \(L\) small. This reduction is achieved by a careful use of the Frank–Wolfe algorithm.
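
To make two ingredients of the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It first evaluates a \(k\)-way marginal query as a count, then runs a generic Frank–Wolfe iteration that approximately projects a point onto a convex hull using only a linear optimization oracle. The function names (`marginal_count`, `frank_wolfe_project`), the toy data, the explicit vertex list, and the step size \(2/(t+2)\) are assumptions made for illustration; the paper works with succinctly represented bodies rather than enumerated vertices, and no privacy noise is added here.

```python
def marginal_count(database, attrs, beta):
    """Answer a k-way marginal query: count rows whose attributes at the
    positions in `attrs` (the subset S) equal the values in `beta`."""
    return sum(
        1 for row in database
        if all(row[a] == b for a, b in zip(attrs, beta))
    )


def frank_wolfe_project(y, vertices, steps=200):
    """Approximately project y onto conv(vertices) by minimizing
    ||x - y||^2 with Frank-Wolfe. The body is touched only through a
    linear optimization oracle (here: a scan over an explicit vertex
    list, which is an illustrative simplification)."""
    x = list(vertices[0])  # any point of the body is a valid start
    for t in range(steps):
        grad = [2.0 * (xi - yi) for xi, yi in zip(x, y)]
        # Oracle call: the vertex minimizing the linearized objective.
        s = min(vertices, key=lambda v: sum(g * vi for g, vi in zip(grad, v)))
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


# Toy database: n = 4 individuals, d = 3 binary attributes.
db = [(1, 0, 1), (1, 1, 1), (0, 0, 1), (1, 0, 0)]
# 2-way marginal with S = {0, 2} and beta = (1, 1): rows 0 and 1 match.
print(marginal_count(db, (0, 2), (1, 1)))  # prints 2

# Stand-in for a noisy answer vector lying outside the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(frank_wolfe_project((2.0, 0.5), square))  # close to (1.0, 0.5)
```

Each Frank–Wolfe step only asks for the minimizer of a linear function over the body, which is why the abstract's question of finding a containing body \(L\) that "one can efficiently optimize over" is the natural requirement for this method.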

Notes

Throughout this introduction, we will ignore the dependence on privacy parameters such as \(\varepsilon \) and \(\delta \), and use the \(\tilde{O}\) and \(\tilde{\varOmega }\) notations to hide factors logarithmic in \(d^k\).
In fact, the algorithm in [32] is exactly the Gaussian noise mechanism for \(\alpha < 2^{-k}\).
This is a simple notational switch from the usual \(\{0,1\}\) vectors, which helps simplify notation.
The symmetric convex hull of a set of points \(P\) is the convex hull of \(P\) and \(-P\).
For example, there is an easy reduction from the maximum cut problem. We omit the details.
The \(p\) to \(q\) norm of a matrix \(M\) is defined as \(\Vert M\Vert _{p\mapsto q} = \max _{x:\Vert x\Vert _p = 1}{\Vert Mx\Vert _q}\).
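
As a small worked instance of the \(p\) to \(q\) norm defined in the last note, the sketch below computes \(\Vert M\Vert _{\infty \mapsto 1}\) by brute force. It relies on the fact that \(\Vert Mx\Vert _1\) is convex in \(x\), so its maximum over the \(\ell _\infty \) unit ball is attained at a sign vector in \(\{-1,1\}^n\). The function name `norm_inf_to_1` and the example matrices are assumptions for illustration; the enumeration takes time exponential in the number of columns and is only meant for tiny inputs.

```python
from itertools import product


def norm_inf_to_1(M):
    """Brute-force ||M||_{inf -> 1}: the maximum of ||Mx||_1 over all
    sign vectors x in {-1, 1}^n, where n is the number of columns."""
    n = len(M[0])
    best = 0.0
    for x in product((-1, 1), repeat=n):
        # ||Mx||_1 = sum over rows of |<row, x>|
        value = sum(abs(sum(row[j] * x[j] for j in range(n))) for row in M)
        best = max(best, value)
    return best


print(norm_inf_to_1([[1, 1], [1, -1]]))  # prints 2
print(norm_inf_to_1([[1, 2], [3, 4]]))   # prints 10
```

For the first matrix, both \(x=(1,1)\) and \(x=(1,-1)\) give \(\Vert Mx\Vert _1 = 2\); for the second, \(x=(1,1)\) gives \(3+7=10\).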

References

Alon, N., Naor, A.: Approximating the cut-norm via Grothendieck’s inequality. In: ACM Symposium on Theory of Computing, pp. 72–80. ACM Press, New York (2004)
Barak, B., Chaudhuri, K., Dwork, C., Kale, S., McSherry, F., Talwar, K.: Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In: Libkin, L. (ed.) Proceedings of ACM PODS, pp. 273–282. ACM Press, New York (2007)
Blum, A., Ligett, K., Roth, A.: A learning theory approach to non-interactive database privacy. In: STOC ’08: Proceedings of the 40th Annual ACM Symposium on Theory of Computing, pp. 609–618. ACM Press, New York (2008)
Bun, M., Ullman, J., Vadhan, S.: Fingerprinting codes and the price of approximate differential privacy. arXiv preprint http://arxiv.org/abs/1311.3158 (2013)
Buss, S.R., Grigoriev, D., Impagliazzo, R., Pitassi, T.: Linear gaps between degrees for the polynomial calculus modulo distinct primes. J. Comput. Syst. Sci. 62(2), 267–289 (2001)
Chandrasekaran, K., Thaler, J., Ullman, J., Wan, A.: Faster private release of marginals on small databases. CoRR http://arxiv.org/abs/1304.3754 (2013)
Cheraghchi, M., Klivans, A., Kothari, P., Lee, H.K.: Submodular functions are noise stable. In: SODA ’12 Proceedings of the Twenty-Third Annual ACM–SIAM Symposium on Discrete Algorithms (SODA), pp. 1586–1592 (2012)
Clarkson, K.: Coresets, sparse greedy approximation, and the Frank–Wolfe algorithm. ACM Trans. Algorithms (TALG) 6(4), 63 (2010)
Dasgupta, S., Gupta, A.: An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct. Algorithms 22, 60–65 (2003)
Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proceedings of 22nd ACM Symposium on Principles of Database Systems, pp. 202–210. ACM Press, New York (2003)
Dubhashi, D.P., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, New York (2009)
Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Advances in Cryptology, CRYPTO’04. Lecture Notes in Computer Science, vol. 3152, pp. 528–544. Springer, Berlin (2004)
Dwork, C., Roth, A.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2013)
Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) Theory of Cryptography. Lecture Notes in Computer Science, vol. 3876, pp. 265–284. Springer, Berlin Heidelberg (2006)
Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. In: Vaudenay, S. (ed.) EUROCRYPT. Lecture Notes in Computer Science, vol. 4004, pp. 486–503. Springer, Heidelberg (2006)
Dwork, C., Naor, M., Reingold, O., Rothblum, G.N., Vadhan, S.: On the complexity of differentially private data release: efficient algorithms and hardness results. In: Proceedings of the 41st ACM Symposium on Theory of Computing, pp. 381–390. ACM Press, New York (2009)
Dwork, C., Rothblum, G.N., Vadhan, S.: Boosting and differential privacy. In: 2010 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 51–60. IEEE, Las Vegas (2010)
Frank, M., Wolfe, P.: An algorithm for quadratic programming. Nav. Res. Logist. Q. 3(1–2), 95–110 (1956)
Grigoriev, D.: Linear lower bound on degrees of positivstellensatz calculus proofs for the parity. Theor. Comput. Sci. 259(1–2), 613–622 (2001)
Grothendieck, A.: Résumé de la théorie métrique des produits tensoriels topologiques. Bol. Soc. Mat. Sao Paulo 8(1–79), 88 (1953)
Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences in combinatorial optimization. Combinatorica 1(2), 169–197 (1981)
Gupta, A., Hardt, M., Roth, A., Ullman, J.: Privately releasing conjunctions and the statistical query barrier. In: STOC, pp. 803–812. ACM Press, New York (2011)
Hardt, M., Rothblum, G.: A multiplicative weights mechanism for privacy-preserving data analysis. In: Proceedings of the 51st Foundations of Computer Science (FOCS). IEEE, Las Vegas (2010)
Hardt, M., Talwar, K.: On the geometry of differential privacy. In: Proceedings of the 42nd ACM Symposium on Theory of computing, STOC ’10, pp. 705–714. ACM Press, New York (2010)
Hardt, M., Rothblum, G.N., Servedio, R.A.: Private data release via learning thresholds. In: Proceedings of the Twenty-Third Annual ACM–SIAM Symposium on Discrete Algorithms, SODA’12, pp. 168–187. SIAM, Kyoto (2012). http://dl.acm.org/citation.cfm?id=2095116.2095131
Hardt, M., Ligett, K., McSherry, F.: A simple and practical algorithm for differentially private data release. In: NIPS, pp. 2348–2356 (2012)
Kasiviswanathan, S., Rudelson, M., Smith, A., Ullman, J.: The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In: Proceedings of the 42nd ACM symposium on Theory of computing, pp. 775–784. ACM Press, New York (2010)
Lindenstrauss, J., Pełczyński, A.: Absolutely summing operators in \(\mathcal {L}_{p}\)-spaces and their applications. Stud. Math. 29(3), 275–326 (1968)
Nikolov, A., Talwar, K., Zhang, L.: The geometry of differential privacy: the sparse and approximate cases. In: Proceedings of the 45th Annual ACM Symposium on Theory of Computing, STOC ’13, pp. 351–360. ACM Press, New York (2013)
O’Donnell, R., Zhou, Y.: Approximability and proof complexity. In: SODA, pp. 1537–1556 (2013)
Roth, A., Roughgarden, T.: Interactive privacy via the median mechanism. In: Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC ’10, pp. 765–774. ACM Press, New York (2010)
Thaler, J., Ullman, J., Vadhan, S.P.: Faster algorithms for privately releasing marginals. ICALP 1, 810–821 (2012)
Ullman, J., Vadhan, S.: PCPs and the hardness of generating private synthetic data. In: Proceedings of the 8th Conference on Theory of Cryptography, TCC 2011, Providence, RI (2011)

Acknowledgments

We would like to thank Moritz Hardt and Anupam Gupta for several helpful discussions.

Author information

Authors and Affiliations

Microsoft Research, Mountain View, CA, USA
Cynthia Dwork
Microsoft Research, Redmond, WA, USA
Aleksandar Nikolov
Google, Mountain View, CA, USA
Kunal Talwar

Corresponding authors

Correspondence to Cynthia Dwork or Aleksandar Nikolov.

Additional information

Editors-in-Charge: Siu-Wing Cheng and Olivier Devillers

Preliminary version in Proceedings of the 30th Annual Symposium on Computational Geometry, 2014.

About this article

Cite this article

Dwork, C., Nikolov, A. & Talwar, K. Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations. Discrete Comput Geom 53, 650–673 (2015). https://doi.org/10.1007/s00454-015-9678-x

Received: 01 August 2014
Revised: 22 January 2015
Accepted: 22 January 2015
Published: 24 April 2015
Issue Date: April 2015
DOI: https://doi.org/10.1007/s00454-015-9678-x
