Abstract
This chapter provides an introduction to the problem of anonymizing patient data derived from Electronic Medical Record (EMR) systems. We first illustrate the need for sharing such data, in a privacy-preserving way, to support a growing number of medical applications. Subsequently, we consider patient re-identification, a threat that has led to violations of patients’ privacy. We discuss the challenges that forestalling patient re-identification entails, as well as how these challenges are addressed by current research. Last, we provide a summary of the topics that will be examined in the remainder of the book.
Keywords
- Electronic Medical Record
- Diagnosis Code
- Healthcare Organization
- Data Utility
- Electronic Medical Record System
Notes
1. National Ambulatory Medical Care Survey (NAMCS). http://www.cdc.gov/nchs/ahcd.htm.
2. National Partnership for Women & Families, Making IT Meaningful: How Consumers Value and Trust Health IT Survey. http://www.nationalpartnership.org/
© 2013 The Author(s)
About this chapter
Cite this chapter
Gkoulalas-Divanis, A., Loukides, G. (2013). Introduction. In: Anonymization of Electronic Medical Records to Support Clinical Analysis. SpringerBriefs in Electrical and Computer Engineering. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5668-1_1
DOI: https://doi.org/10.1007/978-1-4614-5668-1_1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-5667-4
Online ISBN: 978-1-4614-5668-1
eBook Packages: Engineering, Engineering (R0)