Abstract
Family-based association analysis unconditional on parental genotypes models the effects of observed genotypes. This approach has been shown to have greater power than conditional methods. In this chapter, we review popular association analysis methods that account for familial correlations: the marginal model using generalized estimating equations (GEE), the mixed model with a polygenic random component, and genome-wide association analyses. The marginal approach does not explicitly model familial correlations but uses the correlation information to improve the efficiency of parameter estimates. This model, fitted with GEE, is useful when the correlation structure is not of interest, because the correlations are treated as nuisance parameters. In the mixed model, familial correlations are modeled as random effects; for example, the polygenic inheritance model accounts for correlations originating from shared genomic components within a family. These unconditional methods provide a flexible modeling framework for general pedigree data that accommodates traits with various distributions and many types of covariate effects. Genome-wide association studies usually test more than 10,000 SNPs, so traditional statistical methods that account for familial correlations often carry a heavy computational burden. Several recently proposed approaches that avoid this computational issue are reviewed. The single-marker analysis procedures are demonstrated using the R package gee and the ASSOC program in the S.A.G.E. package, including how to prepare input data, conduct the analysis, and interpret the output. ASSOC allows models to include random components for additional familial correlations that may not be sufficiently explained by a polygenic effect, and it addresses nonnormality of response variables by transformation methods. With its ease of use, ASSOC provides a useful tool for association analysis of large pedigree data.
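As a rough orientation, the two modeling strategies named in the abstract can be written in their usual textbook form. The abstract itself gives no formulas, so all notation below is an assumption: family index $i$, trait vector $y_i$ with mean $\mu_i$, design matrix $X_i$ (covariates plus the coded marker genotype), link function $g$, working correlation matrix $R_i(\alpha)$, kinship matrix $\Phi_i$, and variance components $\sigma_g^2$ and $\sigma_e^2$.

```latex
% Marginal model fitted by GEE: only the mean is modeled;
% the familial correlation enters through a working covariance V_i.
\begin{align}
  g(\mu_i) &= X_i \beta, \\
  \sum_{i} D_i^{\top} V_i^{-1} (y_i - \mu_i) &= 0,
  \qquad D_i = \frac{\partial \mu_i}{\partial \beta}, \quad
  V_i = \phi\, A_i^{1/2} R_i(\alpha) A_i^{1/2}.
\end{align}

% Polygenic mixed model: the familial correlation is modeled explicitly
% as a random effect whose covariance is proportional to kinship.
\begin{align}
  y_i &= X_i \beta + g_i + e_i, \qquad
  g_i \sim N\!\left(0,\; 2\Phi_i \sigma_g^2\right), \quad
  e_i \sim N\!\left(0,\; I \sigma_e^2\right), \\
  \operatorname{Cov}(y_i) &= 2\Phi_i \sigma_g^2 + I \sigma_e^2 .
\end{align}
```

Here $A_i$ is the diagonal matrix of variance functions and $\phi$ a dispersion parameter. In the first formulation $\alpha$ is a nuisance parameter, whereas in the second the variance components are part of the model; the chapter's own parameterization may differ from this sketch.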
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Namkung, J., Won, S. (2017). Single Marker Family-Based Association Analysis Not Conditional on Parental Information. In: Elston, R. (eds) Statistical Human Genetics. Methods in Molecular Biology, vol 1666. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7274-6_20
DOI: https://doi.org/10.1007/978-1-4939-7274-6_20
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7273-9
Online ISBN: 978-1-4939-7274-6
eBook Packages: Springer Protocols