Skip to main content

Calibrating Noise to Sensitivity in Private Data Analysis

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNSC,volume 3876)

Abstract

We continue a line of research initiated in [10,11]on privacy-preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user.

Previous work focused on the case of noisy sums, in which f = ∑ i g(x i ), where x i denotes the ith row of the database and g maps database rows to [0,1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f. Roughly speaking, this is the amount that any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case.

The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive.

Keywords

  • Laplace Distribution
  • True Answer
  • Query Function
  • Privacy Breach
  • Semantic Security

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-3-540-32732-5_32

References

  1. Adam, N.R., Wortmann, J.C.: Security-control methods for statistical databases: a comparative study. ACM Computing Surveys 25(4) (December 1989)

    Google Scholar 

  2. Agrawal, D., Aggarwal, C.C.: On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM, New York (2001)

    Google Scholar 

  3. Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Chen, W., Naughton, J.F., Bernstein, P.A. (eds.) SIGMOD Conference, pp. 439–450. ACM, New York (2000)

    Google Scholar 

  4. Ben-Sasson, E., Harsha, P., Raskhodnikova, S.: Some 3cnf properties are hard to test. In: STOC, pp. 345–354. ACM, New York (2000)

    Google Scholar 

  5. Web page for the Bertinoro CS-Statistics workshop on privacy and confidentiality (July 2005), Available from, http://www.stat.cmu.edu/~hwainer

  6. Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: The sulq framework. In: PODS (2005)

    Google Scholar 

  7. Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H.: Toward privacy in public databases. In: Theory of Cryptography Conference (TCC), pp. 363–385 (2005)

    Google Scholar 

  8. Chawla, S., Dwork, C., McSherry, F., Talwar, K.: On the utility of privacy-preserving histograms. In: 21st Conference on Uncertainty in Artificial Intelligence (UAI) (2005)

    Google Scholar 

  9. Denning, D.E.: Secure statistical databases with random sample queries. ACM Transactions on Database Systems 5(3), 291–315 (September 1980)

    CrossRef  MATH  Google Scholar 

  10. Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 202–210 (2003)

    Google Scholar 

  11. Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)

    CrossRef  Google Scholar 

  12. Evfimievski, A.V., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: Proceedings of the Twenty- Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 211–222 (2003)

    Google Scholar 

  13. Goldwasser, S., Micali, S.: Probabilistic encryption. Journal of Computer and System Sciences 28(2), 270–299 (1984)

    MathSciNet  CrossRef  MATH  Google Scholar 

  14. Roque, G.: Masking microdata with mixtures of normal distributions. University of California, Riverside (2000); Doctoral Dissertation

    Google Scholar 

  15. Sweeney, L.: k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)

    MathSciNet  CrossRef  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dwork, C., McSherry, F., Nissim, K., Smith, A. (2006). Calibrating Noise to Sensitivity in Private Data Analysis. In: Halevi, S., Rabin, T. (eds) Theory of Cryptography. TCC 2006. Lecture Notes in Computer Science, vol 3876. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11681878_14

Download citation

  • DOI: https://doi.org/10.1007/11681878_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32731-8

  • Online ISBN: 978-3-540-32732-5

  • eBook Packages: Computer ScienceComputer Science (R0)