Abstract
In this chapter we consider some key elements in conducting a successful genome-wide association study (GWAS). The first step is to design the study well (Subheading 3.1), paying particular attention to the selection of cases and controls and to achieving a sample size adequate for the large multiple-testing burden. Second, we focus on the crucial step of applying stringent quality control (Subheading 3.2) to genotyping results. The most important potential confounder in GWAS is population stratification, and we describe methods for accounting for it in study design and analysis (Subheading 3.3). The primary association analysis is relatively straightforward, and we describe the main approaches to it, including evaluation of results (Subheading 3.4). More comprehensive coverage of the genome can be achieved by imputation, in which an external reference panel is used to estimate genotypes at untyped variants (Subheading 3.5); we consider this in some detail. We finish with some observations on following up a GWAS (Subheading 3.6).
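Three of the quantities touched on above (the multiple-testing burden, a Hardy–Weinberg check used in quality control, and an inflation factor used to assess stratification) can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names are hypothetical, the figure of one million tests is an assumed round number, and the inflation factor follows the common genomic-control convention of dividing the median test statistic by the median of a chi-square distribution with 1 degree of freedom.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (approximate).
CHI2_1DF_MEDIAN = 0.4549364


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic for departure from Hardy-Weinberg equilibrium,
    computed from observed genotype counts at one biallelic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0  # monomorphic SNP: the test is uninformative
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def genomic_inflation(chi2_stats):
    """Inflation factor: median observed 1-df chi-square statistic divided by
    its expected median under the null hypothesis."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Assumption: roughly one million independent tests across the genome.
    print(bonferroni_threshold(0.05, 1_000_000))
    # Genotype counts exactly in Hardy-Weinberg proportions give a statistic of 0.
    print(hwe_chi2(25, 50, 25))
```

An inflation factor close to 1 suggests little overall inflation of the test statistics, whereas values well above 1 may indicate stratification or other systematic bias.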
Abbreviations
- GWAS: Genome-wide association study
- HWE: Hardy–Weinberg equilibrium
- LD: Linkage disequilibrium
- PCA: Principal component analysis
- QC: Quality control
- SNP: Single-nucleotide polymorphism
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Barrett, J.H., Taylor, J.C., Iles, M.M. (2014). Statistical Perspectives for Genome-Wide Association Studies (GWAS). In: Trent, R. (eds) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_4
DOI: https://doi.org/10.1007/978-1-4939-0847-9_4
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0846-2
Online ISBN: 978-1-4939-0847-9
eBook Packages: Springer Protocols