Differential Privacy Theory

Model Selection and Error Estimation in a Nutshell

Part of the book series: Modeling and Optimization in Science and Technologies (MOST, volume 15)

Abstract

The problem of learning from data while preserving the privacy of individual observations has a long history and spans multiple disciplines [1,2,3]. One way to preserve privacy is to corrupt the learning procedure with noise without destroying the information that we want to extract. Differential Privacy (DP) is one of the most powerful tools in this context [3, 4].
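
As a rough illustration of the noise-injection idea, and not the chapter's own construction, the sketch below applies the Laplace mechanism described in [3] to the mean of bounded values. The function name `private_mean`, the clipping bounds, and the replace-one notion of neighbouring datasets are assumptions made for this example.

```python
import numpy as np


def private_mean(values, lower, upper, epsilon, rng=None):
    """Release the mean of `values` with epsilon-DP via the Laplace mechanism.

    Assumes neighbouring datasets differ by replacing one observation,
    so the dataset size n is treated as public.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    x = np.asarray(values, dtype=float)
    if x.size == 0:
        raise ValueError("values must be non-empty")
    rng = np.random.default_rng() if rng is None else rng

    # Clip so that each observation lies in [lower, upper].
    x = np.clip(x, lower, upper)

    # Replacing one clipped observation moves the mean by at most this much.
    sensitivity = (upper - lower) / x.size

    # Laplace noise with scale sensitivity / epsilon gives epsilon-DP.
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(x.mean() + noise)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.integers(0, 2, size=1000)  # hypothetical binary observations
    print(private_mean(data, lower=0.0, upper=1.0, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives stronger privacy but a noisier answer, while a larger sample shrinks the sensitivity and hence the noise; this is the trade-off between privacy and retained information that the abstract alludes to.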

Notes

  1. From now on, with a slight abuse of notation, we will identify \(\boldsymbol{F} = \mathscr{A}(\boldsymbol{S})\).

References

  1. Verykios VS, Bertino E, Fovino IN, Provenza LP, Saygin Y, Theodoridis Y (2004) State-of-the-art in privacy preserving data mining. ACM SIGMOD Record 33(1):50–57

  2. Greengard S (2008) Privacy matters. Commun ACM 51(9):17–18

  3. Dwork C, Roth A (2014) The algorithmic foundations of differential privacy. Found Trends Theor Comput Sci 9(3–4):1–277

  4. Dwork C (2008) Differential privacy: a survey of results. In: International conference on theory and applications of models of computation

  5. Dwork C, Lei J (2009) Differential privacy and robust statistics. In: Annual ACM symposium on theory of computing, pp 371–380

  6. Dwork C, Rothblum GN, Vadhan S (2010) Boosting and differential privacy. In: IEEE annual symposium on foundations of computer science

  7. Williams O, McSherry F (2010) Probabilistic inference and differential privacy. In: Neural information processing systems

  8. Chaudhuri K, Hsu D (2011) Sample complexity bounds for differentially private learning. In: Conference on learning theory

  9. Lei J (2011) Differentially private m-estimators. In: Neural information processing systems

  10. Song S, Chaudhuri K, Sarwate AD (2013) Stochastic gradient descent with differentially private updates. In: IEEE global conference on signal and information processing

  11. Jain P, Thakurta AG (2014) (Near) dimension independent risk bounds for differentially private learning. In: International conference on machine learning

  12. Oh S, Viswanath P (2015) The composition theorem for differential privacy. In: International conference on machine learning

  13. Kairouz P, Oh S, Viswanath P (2015) Secure multi-party differential privacy. In: Neural information processing systems

  14. Kusner MJ, Gardner J, Garnett R, Weinberger K (2015) Differentially private Bayesian optimization. In: International conference on machine learning

  15. Steinke T, Ullman J (2015) Interactive fingerprinting codes and the hardness of preventing false discovery. In: Conference on learning theory

  16. Rogers R, Vadhan S, Lim H, Gaboardi M (2016) Differentially private chi-squared hypothesis testing: goodness of fit and independence testing. In: International conference on machine learning

  17. Friedman A, Schuster A (2010) Data mining with differential privacy. In: ACM international conference on knowledge discovery and data mining

  18. Jain P, Kothari P, Thakurta A (2012) Differentially private online learning. In: Conference on learning theory

  19. Smith A, Thakurta A (2013) Differentially private feature selection via stability arguments, and the robustness of the lasso. In: Conference on learning theory

  20. Jain P, Thakurta A (2013) Differentially private learning with kernels. In: International conference on machine learning

  21. Chaudhuri K, Vinterbo SA (2013) A stability-based validation procedure for differentially private machine learning. In: Neural information processing systems

  22. Chaudhuri K, Hsu DJ, Song S (2014) The large margin mechanism for differentially private maximization. In: Neural information processing systems

  23. Blum A, Hardt M (2015) The ladder: a reliable leaderboard for machine learning competitions. In: International conference on machine learning

  24. Wang Y, Wang YX, Singh A (2015) Differentially private subspace clustering. In: Neural information processing systems

  25. Dwork C, Feldman V, Hardt M, Pitassi T, Reingold O, Roth A (2015) Preserving statistical validity in adaptive data analysis. In: Annual ACM symposium on theory of computing

  26. Dwork C, Feldman V, Hardt M, Pitassi T, Reingold O, Roth A (2015) Generalization in adaptive data analysis and holdout reuse. In: Neural information processing systems

  27. Dwork C, Feldman V, Hardt M, Pitassi T, Reingold O, Roth A (2015) The reusable holdout: preserving validity in adaptive data analysis. Science 349(6248):636–638

  28. Vapnik VN (1998) Statistical learning theory. Wiley, New York

  29. Maurer A, Pontil M (2009) Empirical Bernstein bounds and sample variance penalization. arXiv preprint arXiv:0907.3740

  30. Oneto L, Ridella S, Anguita D (2017) Differential privacy and generalization: sharper bounds with applications. Pattern Recogn Lett 89:31–38

  31. Chernoff H (1952) A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann Math Stat 23(4):493–507

  32. Bernstein S (1924) On a modification of Chebyshev’s inequality and of the error formula of Laplace. Ann Sci Inst Sav Ukraine Sect Math 1(4):38–49

  33. Bennett G (1962) Probability inequalities for the sum of independent random variables. J Am Stat Assoc 57(297):33–45

  34. Clopper CJ, Pearson ES (1934) The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 404–413

  35. Chen X (2008) A link between binomial parameters and means of bounded random variables. arXiv preprint arXiv:0802.3946

  36. Oneto L, Anguita D, Ridella S (2016) PAC-Bayesian analysis of distribution dependent priors: tighter risk bounds and stability analysis. Pattern Recogn Lett 80:200–207

  37. Iverson KE (1962) A programming language. In: ACM spring joint computer conference

  38. Bonferroni CE (1936) Teoria statistica delle classi e calcolo delle probabilita. Libreria internazionale Seeber

  39. Anguita D, Ghio A, Oneto L, Ridella S (2012) In-sample and out-of-sample model selection and error estimation for support vector machines. IEEE Trans Neural Netw Learn Syst 23(9):1390–1406

  40. Langford J (2005) Tutorial on practical prediction theory for classification. J Mach Learn Res 6:273–306

  41. Oneto L, Ghio A, Ridella S, Anguita D (2015) Fully empirical and data-dependent stability-based bounds. IEEE Trans Cybern 45(9):1913–1926

  42. Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58(301):13–30

Author information

Corresponding author

Correspondence to Luca Oneto.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this chapter

Oneto, L. (2020). Differential Privacy Theory. In: Model Selection and Error Estimation in a Nutshell. Modeling and Optimization in Science and Technologies, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-24359-3_9
