Replicating Sequencing-Based Association Studies of Rare Variants

Abstract

Large-scale sequence-based association analysis is a powerful approach for identifying rare variants involved in complex trait etiologies. Significant stage 1 findings must be confirmed by replication in an independent stage 2 sample to avoid reporting spurious results. For gene-based mapping of rare variants, where rare variants within a region are analyzed in aggregate, three replication strategies are possible: (1) variant-based replication, in which only variants at nucleotide sites uncovered in stage 1 within the gene region are genotyped and followed up; (2) sequence-based replication, in which the gene region is sequenced in the replication sample and both known and novel variants are tested; and (3) exome-array-based replication, in which the gene region identified in the stage 1 sample is followed up using exome arrays in the stage 2 sample. The efficiency of the three strategies depends on the proportion of causative variants discovered in stage 1, sequencing and genotyping errors, trait-specific genetic architecture, and how many variants within the identified gene region are available for genotyping on the exome array. Using rigorous population genetic and phenotypic models, it is demonstrated that sequence-based replication is consistently more powerful than variant- and exome-array-based replication, although the power gain can be small. For variant-based replication, if the stage 1 sample consists of several thousand individuals, a large fraction of causative variant sites can be observed, and even for smaller stage 1 studies, a large proportion of the locus population attributable risk can be explained by the uncovered variants. Exome-array-based replication can have power comparable to the other two approaches if the coding variants driving the association are well represented on the array. As a consequence, although sequence-based replication is usually more powerful and is also valuable for identifying novel, potentially causal variants, both variant- and exome-array-based replication can be viable and cost-effective approaches for replicating rare variant associations.
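
The claim that a stage 1 sample of several thousand individuals uncovers most causative variant sites can be illustrated with a small back-of-the-envelope calculation. The sketch below is not the chapter's population genetic model. It assumes independent chromosomes (Hardy-Weinberg proportions), no sequencing or genotyping error, and a hypothetical list of minor allele frequencies (`example_mafs`) chosen only for illustration. The allele-weighted fraction is a crude stand-in for the share of locus risk explained by discovered sites, and it is only meaningful if every causal variant is assumed to have the same effect size.

```python
def prob_site_observed(maf, n_individuals):
    """Probability that at least one minor allele copy appears among
    2 * n_individuals independently sampled chromosomes."""
    return 1.0 - (1.0 - maf) ** (2 * n_individuals)


def expected_fraction_observed(mafs, n_individuals):
    """Expected fraction of variant sites seen at least once in stage 1."""
    return sum(prob_site_observed(p, n_individuals) for p in mafs) / len(mafs)


def expected_allele_fraction_captured(mafs, n_individuals):
    """Expected share of the cumulative minor allele frequency carried by
    sites seen in stage 1 (a rough proxy under equal effect sizes)."""
    total = sum(mafs)
    return sum(p * prob_site_observed(p, n_individuals) for p in mafs) / total


# Hypothetical rare-variant frequencies at one gene (illustrative assumption).
example_mafs = [0.0001, 0.0002, 0.0005, 0.001, 0.005]

if __name__ == "__main__":
    for n in (250, 1000, 5000):
        sites = expected_fraction_observed(example_mafs, n)
        alleles = expected_allele_fraction_captured(example_mafs, n)
        print(f"n={n}: sites observed={sites:.3f}, allele share captured={alleles:.3f}")
```

Because rarer sites are the ones most likely to be missed, the allele-weighted share is never lower than the plain fraction of sites observed. This is in line with the observation that even smaller stage 1 studies can account for much of the locus-level signal in variant-based replication.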

Acknowledgments

This work was supported by National Institutes of Health grants HL102926 and MD005964.

Author information

Corresponding author

Correspondence to Suzanne M. Leal.

Copyright information

© 2015 Springer Science+Business Media New York

About this chapter

Cite this chapter

Liu, D.J., Leal, S.M. (2015). Replicating Sequencing-Based Association Studies of Rare Variants. In: Zeggini, E., Morris, A. (eds) Assessing Rare Variation in Complex Traits. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2824-8_14
