Improving privacy preservation policy in the modern information age

  • Original Paper
Health and Technology

Abstract

Anonymization or de-identification techniques are methods for protecting the privacy of human subjects in sensitive data sets while preserving the utility of those data sets. In the case of health data, anonymization techniques may be used to remove or mask patient identities while allowing the health data content to be used by the medical and pharmaceutical research community. The efficacy of anonymization methods has come under repeated attack, and several researchers have shown that anonymized data can be re-identified to reveal the identity of the data subjects via approaches such as “linking.” Despite these deficiencies, many government privacy policies depend on anonymization techniques as the primary approach to preserving privacy. In this report, we survey the anonymization landscape and consider the range of anonymization approaches that can be used to de-identify data containing personally identifiable information. We then review several notable government privacy policies that leverage anonymization. In particular, we review the European Union’s General Data Protection Regulation (GDPR) and show that it takes a more goal-oriented approach to data privacy. It defines data privacy in terms of a desired outcome (i.e., as a defense against the risk of personal data disclosure) and is agnostic to the actual method of privacy preservation. The GDPR goes further and frames its privacy preservation regulations relative to the state of the art, the cost of implementation, the incurred risks, and the context of data processing. This has potential implications for the GDPR’s robustness to future technological innovations, in marked contrast to privacy regulations that depend explicitly on more definite technical specifications.
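As a rough illustration of the “linking” approach mentioned above, the following sketch joins a de-identified data set to a public one on shared quasi-identifiers. All records, field names, and the `link` helper are hypothetical and chosen only for illustration; this is not the method of any specific study surveyed in the paper.

```python
# Hypothetical "de-identified" health records: names removed,
# but quasi-identifiers (ZIP code, birth year, sex) retained.
health = [
    {"zip": "02138", "birth_year": 1945, "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02138", "birth_year": 1972, "sex": "M", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1972, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1972, "sex": "M", "diagnosis": "asthma"},
]

# Hypothetical public records (for example, a voter roll) that carry
# names alongside the same quasi-identifiers.
public = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1945, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1972, "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Project a record onto its quasi-identifier values."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def link(health_records, public_records):
    """Map each named person to a diagnosis when exactly one
    health record shares that person's quasi-identifiers."""
    groups = {}
    for rec in health_records:
        groups.setdefault(key(rec), []).append(rec)

    reidentified = {}
    for person in public_records:
        matches = groups.get(key(person), [])
        if len(matches) == 1:  # a unique match reveals the identity
            reidentified[person["name"]] = matches[0]["diagnosis"]
    return reidentified


reidentified = link(health, public)
```

In this toy data, one person matches a single health record and is re-identified, while the other matches two records and stays ambiguous. Linking succeeds whenever a combination of quasi-identifiers is unique in the released data.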

Notes

  1. Or “natural persons”, the term that the GDPR uses for individuals.

  2. The definition of what is sensitive depends on personal opinions and tastes, though many would agree that certain attributes would universally be considered sensitive.

  3. http://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html#standard

  4. The statement of this policy and links to archived PIAs can be found at: (http://www.census.gov/about/policies/privacy/pia.html). The PIAs also serve to record information-sharing partners (usually other federal agencies) and consent collection practices.

  5. The Census data releases on languages spoken at home and English-speaking ability demonstrate this approach (https://www.census.gov/data/tables/2013/demo/2009-2013-lang-tables.html). These tables withhold state-level statistics on low-use languages like Welsh or Papia Mentae.

  6. See, for example, the Census Bureau’s online data visualization map: http://onthemap.ces.census.gov/

  7. http://www.nasbe.org/wp-content/uploads/2015-Federal-Education-Data-Privacy-Bills-Comparison-2015.07.22-Public.pdf

  8. Zarsky, T.Z., 2016. Incompatible: The GDPR in the Age of Big Data. Seton Hall L. Rev., 47, p.995.

  9. Goodman, B. and Flaxman, S., 2016. European Union regulations on algorithmic decision-making and a “right to explanation”. arXiv preprint arXiv:1606.08813.

  10. Article 29 Working Party, Opinion 05/2014 on Anonymization Techniques, WP216, http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf

  11. edX is a provider of university-level massive open online courses, http://www.edx.org

  12. https://ask.census.gov/faq.php?id=5000&faqId=665

  13. http://psurdc.psu.edu/content/applying-special-sworn-status

  14. https://en.wikipedia.org/wiki/Classified_information

  15. http://www.regblog.org/2012/05/08/the-performance-of-performance-standards/

Acknowledgements

We would like to thank Marjory Blumenthal and Rebecca Balebako for their detailed and thoughtful review of early drafts of this document. We are immensely grateful for their comments and feedback. Any errors contained herein are our own and should not be attributed to them.

Author information

Corresponding author

Correspondence to John S. Davis II.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Regulation, G.D.P., 2016. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46. Official Journal of the European Union (OJ), 59, pp.1-88.

About this article

Cite this article

Davis, J.S., Osoba, O. Improving privacy preservation policy in the modern information age. Health Technol. 9, 65–75 (2019). https://doi.org/10.1007/s12553-018-0250-6

  • DOI: https://doi.org/10.1007/s12553-018-0250-6
