Randomization Methods to Ensure Data Privacy

  • Reference work entry

Synonyms

Perturbation techniques

Definition

Many organizations, e.g., government statistical offices and search engine companies, collect potentially sensitive information about individuals, either to publish the data for research or in return for useful services. While some data collection organizations, like the census, are legally required not to breach the privacy of individuals, others may not be trusted to uphold privacy. Hence, if U denotes the original data containing sensitive information about a set of individuals, then an untrusted data collector or researcher should only have access to an anonymized version of the data, U*, that does not disclose the sensitive information about the individuals. A randomized anonymization algorithm R is said to be a privacy-preserving randomization method if, for every table T and for every output T* = R(T), the privacy of all the sensitive information of each individual in the original data is...
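As a rough illustration of what a randomized algorithm R mapping a table T to an output T* can look like, the sketch below implements a simple form of randomized response, in which each sensitive bit is kept with probability p and flipped otherwise. The function names, the 0/1 encoding of the sensitive attribute, and the default p = 0.75 are assumptions made for this example and do not come from the entry. Whether such a scheme meets the entry's privacy definition depends on the part of the definition that is cut off above.

```python
import random


def randomized_response(table, p=0.75, rng=None):
    """Hypothetical R: keep each 0/1 value with probability p, flip it otherwise."""
    rng = rng or random.Random()
    return [bit if rng.random() < p else 1 - bit for bit in table]


def estimate_fraction(perturbed, p=0.75):
    """Estimate the true fraction of 1s in T from the perturbed table T*.

    E[observed] = p*f + (1 - p)*(1 - f), so f = (observed - (1 - p)) / (2p - 1).
    """
    if p == 0.5:
        raise ValueError("p = 0.5 destroys all information about the data")
    observed = sum(perturbed) / len(perturbed)
    return (observed - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    T = [1] * 3000 + [0] * 7000           # original sensitive bits (30% ones)
    T_star = randomized_response(T, 0.75, rng)
    print(estimate_fraction(T_star, 0.75))  # close to 0.3 for a table this large
```

Individual entries of T* are unreliable, yet the aggregate fraction can still be estimated. This is the usual trade-off that perturbation techniques aim for.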

Author information

Correspondence to Ashwin Machanavajjhala.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Machanavajjhala, A., Gehrke, J. (2018). Randomization Methods to Ensure Data Privacy. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_301
