Statistical Analysis of GWAS

  • Florian Frommlet
  • Małgorzata Bogdan
  • David Ramsey
Part of the Computational Biology book series (COBO, volume 18)


This chapter discusses the statistical analysis of genome-wide association studies. After briefly touching on genotype calling and the imputation of missing data, we are mainly concerned with the downstream analysis of association, where the genotype of each individual at each SNP has already been established. The classical approach of single marker tests combined with multiple testing correction is contrasted with different strategies of model selection, which tend to perform much better at correctly identifying causal SNPs in the case of complex traits. Specific methods for handling rare SNPs, as well as population stratification, are discussed. Admixture mapping and the analysis of gene–gene interactions are among the more advanced topics also considered here.


Keywords: Minor Allele Frequency; Rare Variant; Dominance Effect; Multifactor Dimensionality Reduction; Transmission Disequilibrium Test



Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  • Florian Frommlet (1, corresponding author)
  • Małgorzata Bogdan (2)
  • David Ramsey (3)
  1. Center for Medical Statistics, Informatics, and Intelligent Systems, Section for Medical Statistics, Medical University of Vienna, Vienna, Austria
  2. Institute of Mathematics, University of Wrocław, Wrocław, Poland
  3. Department of Operations Research, Wrocław University of Technology, Wrocław, Poland