Improving Imputation Accuracy by Inferring Causal Variants in Genetic Studies

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2017)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10229)

Abstract

Genotype imputation is widely used in the analysis of Genome-Wide Association Studies (GWAS) for two reasons. The first is to increase the power of association studies when causal SNPs are not collected in the GWAS. The second is to aid the interpretation of a GWAS result by predicting the association statistics at untyped variants. In this paper, we show that predicting association statistics at untyped variants that influence the trait produces overly conservative results. Current imputation methods assume that none of the variants in a region (a locus consisting of multiple variants) affects the trait, an assumption that is often inconsistent with the observed data. We propose a new method, CAUSAL-Imp, which imputes the association statistics at untyped variants while taking into account variants in the region that may affect the trait. Our method builds on recent methods that impute the marginal statistics for GWAS by using the fact that marginal statistics follow a multivariate normal distribution. We use both simulated and real data sets to assess the performance of our method. We show that traditional imputation approaches underestimate the association statistics for variants involved in the trait, and that our approach provides less biased estimates of these association statistics.
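
The multivariate normal argument can be illustrated with a minimal Python sketch of the traditional, null-model imputation step. This is not CAUSAL-Imp itself, whose details are not given in this abstract. The sketch assumes that z-scores in a region follow N(0, Sigma) under the null, where Sigma is the LD (correlation) matrix, so the imputed statistic at untyped variants is the conditional mean Sigma_ut Sigma_tt^{-1} z_t. The function name `impute_z_null`, the `ridge` parameter, and the toy numbers (LD r = 0.8, non-centrality 5.0) are illustrative assumptions, not values from the paper. The toy case also assumes the standard model in which a causal variant with non-centrality `ncp` shifts the expected statistic at a correlated variant to r * ncp.

```python
import numpy as np

def impute_z_null(z_typed, sigma_tt, sigma_ut, ridge=0.0):
    """Conditional mean of untyped z-scores given typed z-scores,
    assuming z ~ N(0, Sigma), i.e. no variant in the region is causal.

    z_typed  : (t,)   observed z-scores at typed variants
    sigma_tt : (t, t) LD among typed variants
    sigma_ut : (u, t) LD between untyped and typed variants
    ridge    : optional diagonal regularizer (hypothetical parameter)
    """
    z_typed = np.asarray(z_typed, dtype=float)
    sigma_tt = np.asarray(sigma_tt, dtype=float)
    sigma_ut = np.asarray(sigma_ut, dtype=float)
    regularized = sigma_tt + ridge * np.eye(sigma_tt.shape[0])
    # Solve instead of inverting: weights = Sigma_ut @ inv(Sigma_tt)
    weights = np.linalg.solve(regularized, sigma_ut.T).T
    return weights @ z_typed

# Toy case: one typed variant in LD r = 0.8 with one untyped causal variant.
r = 0.8
ncp = 5.0                                 # assumed expected z at the causal variant
expected_z_typed = np.array([r * ncp])    # expected z at the typed variant
imputed = impute_z_null(expected_z_typed, [[1.0]], [[r]])

# The null-model estimate is r**2 * ncp, which is below ncp whenever |r| < 1.
print(f"expected z at causal variant: {ncp}")
print(f"null-model imputed z:         {imputed[0]:.2f}")
```

In this toy case the null-model estimate shrinks the statistic at the causal variant by a factor of r squared, which is consistent with the underestimation described in the abstract.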

Y. Wu and F. Hormozdiari contributed equally to this work.

Author information

Corresponding author

Correspondence to Eleazar Eskin.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wu, Y., Hormozdiari, F., Joo, J.W.J., Eskin, E. (2017). Improving Imputation Accuracy by Inferring Causal Variants in Genetic Studies. In: Sahinalp, S. (eds) Research in Computational Molecular Biology. RECOMB 2017. Lecture Notes in Computer Science, vol 10229. Springer, Cham. https://doi.org/10.1007/978-3-319-56970-3_19

  • DOI: https://doi.org/10.1007/978-3-319-56970-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56969-7

  • Online ISBN: 978-3-319-56970-3

  • eBook Packages: Computer Science, Computer Science (R0)
