Abstract
With the rapid advancement of high-throughput DNA sequencing technologies, genomics has become a big data discipline in which large-scale genetic information about human individuals can be obtained efficiently and at low cost. However, such a massive amount of personal genomic data creates a tremendous challenge for privacy, especially given the emergence of the direct-to-consumer (DTC) industry that provides genetic testing services. Here we review recent developments in genomic big data and their implications for privacy. We also discuss the current dilemmas and future challenges of genomic privacy.
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Shen, H., Ma, J. (2017). Privacy Challenges of Genomic Big Data. In: Shen, B. (eds) Healthcare and Big Data Management. Advances in Experimental Medicine and Biology, vol 1028. Springer, Singapore. https://doi.org/10.1007/978-981-10-6041-0_8
DOI: https://doi.org/10.1007/978-981-10-6041-0_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6040-3
Online ISBN: 978-981-10-6041-0
eBook Packages: Biomedical and Life Sciences (R0)