A Best Practice Approach to Anonymization

  • Elaine Mackey
Living reference work entry

Abstract

The need for clear guidance on anonymization is becoming increasingly pressing for the research community given the move toward open research data as common practice. Most research funders take the view that publicly funded research data are a public good which should be shared as widely as possible. Thus researchers are commonly required to detail data sharing intentions at the grant application stage. What this means in practice is that researchers need to understand the data they collect and hold and under what circumstances, if at all, they can share data; anonymization is a process critical to this, but it is complex and not well understood. This chapter provides an introduction to the topic of anonymization, defining key terminology and setting out perspectives on the assessment and management of reidentification risk and on the role of anonymization in data protection. Next, the chapter outlines a principled and holistic approach to doing well-thought-out anonymization: the Anonymisation Decision-making Framework (ADF). The framework unifies the technical, legal, ethical, and policy aspects of anonymization.

Keywords

Anonymization; Anonymisation Decision-making Framework; Data environment; Personal data; General Data Protection Regulation

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Centre for Epidemiology Versus Arthritis, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
