Privacy Challenges of Genomic Big Data

Part of the book series: Advances in Experimental Medicine and Biology (AEMB, volume 1028)

Abstract

With the rapid advancement of high-throughput DNA sequencing technologies, genomics has become a big data discipline in which large-scale genetic information about human individuals can be obtained efficiently and at low cost. However, such a massive amount of personal genomic data creates tremendous challenges for privacy, especially given the emergence of a direct-to-consumer (DTC) industry that provides genetic testing services. Here we review recent developments in genomic big data and their implications for privacy. We also discuss the current dilemmas and future challenges of genomic privacy.

Corresponding authors

Correspondence to Hong Shen or Jian Ma.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Shen, H., Ma, J. (2017). Privacy Challenges of Genomic Big Data. In: Shen, B. (ed.) Healthcare and Big Data Management. Advances in Experimental Medicine and Biology, vol 1028. Springer, Singapore. https://doi.org/10.1007/978-981-10-6041-0_8
