Randomization Methods to Ensure Data Privacy

Machanavajjhala, Ashwin; Gehrke, Johannes

doi:10.1007/978-1-4614-8265-9_301

Reference work entry
First Online: 01 January 2018

Synonyms

Perturbation techniques

Definition

Many organizations, e.g., government statistical offices and search engine companies, collect potentially sensitive information about individuals, either to publish the data for research or in return for useful services. While some data collection organizations, like the census, are legally required not to breach the privacy of the individuals, other data collection organizations may not be trusted to uphold privacy. Hence, if U denotes the original data containing sensitive information about a set of individuals, then an untrusted data collector or researcher should only have access to an anonymized version of the data, U*, that does not disclose the sensitive information about the individuals. A randomized anonymization algorithm R is said to be a privacy-preserving randomization method if for every table T, and for every output T* = R(T), the privacy of all the sensitive information of each individual in the original data is...
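
As an illustration of what a randomized anonymization algorithm R can look like, the following is a minimal sketch of randomized response (Warner, 1965), one classic perturbation technique. It is not necessarily the method this entry goes on to define. The sketch assumes a table T with a single sensitive yes/no attribute per individual; the function names and the retention probability p are hypothetical choices made for this example.

```python
import random


def randomized_response(truth: bool, p: float, rng: random.Random) -> bool:
    """Report the true bit with probability p, otherwise report its negation."""
    return truth if rng.random() < p else not truth


def randomize_table(table: list[bool], p: float, rng: random.Random) -> list[bool]:
    """Apply R independently to every individual's sensitive bit: T* = R(T)."""
    return [randomized_response(bit, p, rng) for bit in table]


def estimate_true_fraction(randomized: list[bool], p: float) -> float:
    """Estimate the fraction of True values in T from T* alone.

    If pi is the true fraction, the expected observed fraction is
    pi * p + (1 - pi) * (1 - p), which is inverted here. Requires p != 0.5.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 destroys all information; estimate undefined")
    observed = sum(randomized) / len(randomized)
    return (observed - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    original = [True] * 3000 + [False] * 7000   # T: 30% hold the sensitive value
    released = randomize_table(original, p=0.75, rng=rng)  # T*
    print(estimate_true_fraction(released, p=0.75))
```

Each released bit is individually deniable, since any single answer may have been flipped, yet an aggregate such as the overall fraction can still be estimated from T*. Values of p closer to 0.5 give more deniability and noisier estimates.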

Author information

Authors and Affiliations

Cornell University, Ithaca, NY, USA
Ashwin Machanavajjhala & Johannes Gehrke

Corresponding author

Correspondence to Ashwin Machanavajjhala.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Section Editor information

Dept. of Computer Science, Purdue University, West Lafayette, IN, USA
Chris Clifton

Cite this entry

Machanavajjhala, A., Gehrke, J. (2018). Randomization Methods to Ensure Data Privacy. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_301

DOI: https://doi.org/10.1007/978-1-4614-8265-9_301
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
