Statistical Analysis of GWAS

Frommlet, Florian; Bogdan, Małgorzata; Ramsey, David

doi:10.1007/978-1-4471-5310-8_5

Part of the book series: Computational Biology (COBO, volume 18)

Abstract

This chapter discusses the statistical analysis of genome-wide association studies (GWAS). After briefly touching on genotype calling and the imputation of missing data, we focus on the downstream analysis of association, where the genotype of each individual at each SNP has been established. The classical approach of single-marker tests combined with multiple-testing correction is contrasted with different model selection strategies, which tend to perform much better at correctly identifying causal SNPs for complex traits. Specific methods for handling rare SNPs and population stratification are discussed. Admixture mapping and the analysis of gene–gene interactions are among the more advanced topics also considered.

Author information

Authors and Affiliations

Center for Medical Statistics, Informatics, and Intelligent Systems, Section for Medical Statistics, Medical University of Vienna, Spitalgasse 23, 1090, Vienna, Austria
Florian Frommlet
Institute of Mathematics, University of Wrocław, Wrocław, Poland
Małgorzata Bogdan
Department of Operations Research, Wrocław University of Technology, Wrocław, Poland
David Ramsey

Corresponding author

Correspondence to Florian Frommlet.

About this chapter

Cite this chapter

Frommlet, F., Bogdan, M., Ramsey, D. (2016). Statistical Analysis of GWAS. In: Phenotypes and Genotypes. Computational Biology, vol 18. Springer, London. https://doi.org/10.1007/978-1-4471-5310-8_5

DOI: https://doi.org/10.1007/978-1-4471-5310-8_5
Published: 13 February 2016
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5309-2
Online ISBN: 978-1-4471-5310-8
eBook Packages: Computer Science (R0)
