Understanding Genomic Variations in the Context of Health and Disease: Annotation, Interpretation, and Challenges

  • Ankita Narang
  • Aniket Bhattacharya
  • Mitali Mukerji
  • Debasis Dash


Human genomes vary extensively: no two are exactly alike. Separating “functional” variants from bystanders is a Herculean task, complicated by the contextual nature of these variations. Although many well-maintained repositories of variation data exist, a systematic method for gathering information from different sources and collating it coherently to prioritize functional variants is urgently needed. We begin this chapter with a brief classification of genomic variations and discuss the factors that govern such widespread variability, the methods currently used to study genomic variations, and the potential applications of studying them. We then briefly describe the resources that catalog variation data and discuss studies that have meaningfully annotated variations in specific contexts. We conclude by proposing strategies for variant prioritization, including how to ascertain the functionality of non-coding variants.


Variation; Genomics; GWAS; Functional; Contextual; Missing heritability; eQTL



We gratefully acknowledge project funding from the Council of Scientific and Industrial Research (CMM-0016 and MLP-901), the DST Inspire and D.S. Kothari Postdoctoral Fellowship to AN, and the CSIR-Senior Research Fellowship to AB. We thank Uma Anwardekar for fruitful comments and discussions.



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Ankita Narang (1, 2)
  • Aniket Bhattacharya (3, 4)
  • Mitali Mukerji (1, 3, 4)
  • Debasis Dash (1, 4)

  1. G.N. Ramachandran Knowledge Centre for Genome Informatics, Council of Scientific and Industrial Research – Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  2. Epigenetics Lab, Dr. B.R. Ambedkar Center for Biomedical Research, University of Delhi (North Campus), Delhi, India
  3. Genomics and Molecular Medicine, Council of Scientific and Industrial Research – Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  4. Academy of Scientific and Innovative Research (AcSIR), Delhi, India
