Preventing Re-identification While Supporting GWAS

  • Chapter
Anonymization of Electronic Medical Records to Support Clinical Analysis

Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSELECTRIC)

Abstract

This chapter discusses how clinical data can be published in a way that prevents re-identification attacks while still supporting the validation of Genome-Wide Association Studies (GWAS). Sect. 4.1 motivates the problem, and Sects. 4.2 and 4.3 give an overview of an approach that addresses it [5, 9]. The approach extracts potentially linkable clinical features and modifies them so that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and the specific sets of clinical features that correspond to GWAS-related diseases.
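
The linkage risk described in the abstract can be illustrated with a minimal sketch. This is not the chapter's algorithm: it only checks whether a combination of diagnosis codes matches fewer than k patient records, which is one way to read "a small number of patients". The threshold `k`, the function names `support` and `is_linkable`, and the toy records are all assumptions made for illustration.

```python
from typing import Iterable


def support(records: list[frozenset[str]], codes: Iterable[str]) -> int:
    """Count the records that contain every code in `codes`."""
    wanted = frozenset(codes)
    return sum(1 for record in records if wanted <= record)


def is_linkable(records: list[frozenset[str]], codes: Iterable[str], k: int) -> bool:
    """Treat a code set as risky if it matches at least one but fewer than k records.

    `k` is a hypothetical protection threshold, not a value taken from the chapter.
    """
    matches = support(records, codes)
    return 0 < matches < k


# Toy data (hypothetical): each record is the set of diagnosis codes of one patient.
records = [
    frozenset({"250.00", "401.1"}),
    frozenset({"250.00", "401.1", "272.4"}),
    frozenset({"250.00", "493.00"}),
    frozenset({"401.1"}),
]

common_pair = {"250.00", "401.1"}  # appears in two records
rare_code = {"493.00"}             # appears in a single record

print(support(records, common_pair), is_linkable(records, common_pair, k=2))
print(support(records, rare_code), is_linkable(records, rare_code, k=2))
```

In this toy setting the rare code singles out one patient and would need to be modified before release, while the common pair already matches at least k records. Whether a modification also preserves the associations needed for GWAS validation is a separate check that this sketch does not attempt.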

Notes

  1. http://www.cdc.gov/nchs/icd/icd9cm.htm

References

  1. Aggarwal, C.C.: On k-anonymity and the curse of dimensionality. In: VLDB, pp. 901–909 (2005)

  2. Cao, J., Karras, P., Raïssi, C., Tan, K.: rho-uncertainty: Inference-proof transaction anonymization. PVLDB 3(1), 1033–1044 (2010)

  3. Emam, K.E., Dankar, F.K.: Protecting privacy using k-anonymity. Journal of the American Medical Informatics Association 15(5), 627–637 (2008)

  4. Fung, B.C.M., Wang, K., Chen, R., Yu, P.S.: Privacy-preserving data publishing: A survey on recent developments. ACM Comput. Surv. 42 (2010)

  5. Gkoulalas-Divanis, A., Loukides, G.: PCTA: Privacy-constrained Clustering-based Transaction Data Anonymization. In: EDBT PAIS, p. 5 (2011)

  6. He, Y., Naughton, J.F.: Anonymization of set-valued data via top-down, local generalization. PVLDB 2(1), 934–945 (2009)

  7. LeFevre, K., DeWitt, D., Ramakrishnan, R.: Incognito: efficient full-domain k-anonymity. In: SIGMOD, pp. 49–60 (2005)

  8. Loukides, G., Denny, J., Malin, B.: The disclosure of diagnosis codes can breach research participants’ privacy. Journal of the American Medical Informatics Association 17, 322–327 (2010)

  9. Loukides, G., Gkoulalas-Divanis, A., Malin, B.: Anonymization of electronic medical records for validating genome-wide association studies. Proceedings of the National Academy of Sciences 107(17), 7898–7903 (2010)

  10. Loukides, G., Gkoulalas-Divanis, A., Malin, B.: COAT: Constraint-based anonymization of transactions. KAIS 28(2), 251–282 (2011)

  11. Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, M.: l-diversity: Privacy beyond k-anonymity. In: ICDE, p. 24 (2006)

  12. Manolio, T.A.: Collaborative genome-wide association studies of diverse diseases: programs of the NHGRI's Office of Population Genomics. Pharmacogenomics 10(2), 235–241 (2009)

  13. National Institutes of Health: Policy for sharing of data obtained in NIH supported or conducted genome-wide association studies. NOT-OD-07-088. 2007.

  14. Nergiz, M.E., Clifton, C.: Thoughts on k-anonymization. DKE 63(3), 622–645 (2007)

  15. Ohno-Machado, L., Vinterbo, S., Dreiseitl, S.: Effects of data anonymization by cell suppression on descriptive statistics and predictive modeling performance. Journal of the American Medical Informatics Association 9(6), 115–119 (2002)

  16. Samarati, P.: Protecting respondents' identities in microdata release. TKDE 13(6), 1010–1027 (2001)

  17. Sweeney, L.: k-anonymity: a model for protecting privacy. IJUFKS 10, 557–570 (2002)

  18. Terrovitis, M., Mamoulis, N., Kalnis, P.: Privacy-preserving anonymization of set-valued data. PVLDB 1(1), 115–125 (2008)

  19. Terrovitis, M., Mamoulis, N., Kalnis, P.: Local and global recoding methods for anonymizing set-valued data. VLDB J 20(1), 83–106 (2011)

  20. Xu, Y., Wang, K., Fu, A.W.C., Yu, P.S.: Anonymizing transaction databases for publication. In: KDD, pp. 767–775 (2008)

Copyright information

© 2013 The Author(s)

About this chapter

Cite this chapter

Gkoulalas-Divanis, A., Loukides, G. (2013). Preventing Re-identification While Supporting GWAS. In: Anonymization of Electronic Medical Records to Support Clinical Analysis. SpringerBriefs in Electrical and Computer Engineering. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5668-1_4

  • DOI: https://doi.org/10.1007/978-1-4614-5668-1_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-5667-4

  • Online ISBN: 978-1-4614-5668-1

  • eBook Packages: Engineering, Engineering (R0)
