# Bayesian model averaging for evaluation of candidate gene effects

## Abstract

Statistical assessment of candidate gene effects can be viewed as a problem of variable selection and model comparison. Given a set of genes to be considered, many possible models may fit the data well, each including a specific subset of gene effects and possibly their interactions. The question then arises as to which of these models is most plausible. Inference about candidate gene effects based on a single chosen model ignores uncertainty about model choice. Here, a Bayesian model averaging approach is proposed for the evaluation of candidate gene effects. The method is implemented through simultaneous sampling of multiple models. By averaging over a set of competing models, the Bayesian model averaging approach incorporates model uncertainty into inferences about candidate gene effects. Features of the method are demonstrated using a simulated data set with ten candidate genes under consideration.
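The idea of averaging over competing models can be illustrated with a minimal sketch. It is not the sampling-based implementation described above. It assumes a small number of candidate genes, so that every subset model can be enumerated, and it approximates posterior model probabilities with BIC weights under a uniform prior over models. The function name `bma_inclusion_probabilities`, the 0/1/2 genotype coding, and the simulated effect sizes are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def bma_inclusion_probabilities(X, y):
    """Approximate posterior inclusion probabilities with BIC weights.

    Each column of X is one (hypothetical) candidate gene covariate.
    Every subset of columns defines a linear model with an intercept,
    fitted by ordinary least squares. Models are weighted by
    exp(-BIC / 2), i.e. a BIC approximation under a uniform model prior.
    """
    n, p = X.shape
    models, bics = [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            design = np.column_stack([np.ones(n), X[:, list(subset)]])
            beta, *_ = np.linalg.lstsq(design, y, rcond=None)
            rss = float(np.sum((y - design @ beta) ** 2))
            bic = n * np.log(rss / n) + design.shape[1] * np.log(n)
            models.append(subset)
            bics.append(bic)

    bics = np.array(bics)
    # Subtract the minimum BIC before exponentiating for numerical stability.
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()

    # Inclusion probability of a gene = total weight of models containing it.
    inclusion = np.zeros(p)
    for subset, w in zip(models, weights):
        for j in subset:
            inclusion[j] += w
    return models, weights, inclusion


# Simulated example (assumed setup): 4 candidate genes, genes 0 and 2 have effects.
rng = np.random.default_rng(0)
n, p = 200, 4
X = rng.integers(0, 3, size=(n, p)).astype(float)  # genotype codes 0, 1, 2
y = 1.0 * X[:, 0] + 0.8 * X[:, 2] + rng.normal(size=n)

models, weights, inclusion = bma_inclusion_probabilities(X, y)
for j, prob in enumerate(inclusion):
    print(f"gene {j}: approximate inclusion probability = {prob:.3f}")
```

Exhaustive enumeration grows as 2^p, which is one reason sampling over model space becomes attractive as the number of candidate genes and interactions increases.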

## Keywords

Bayes factor, Bayesian model averaging, Candidate genes, Linear models, Markov chain Monte Carlo, Quantitative traits

## Notes

### Acknowledgments

This research was supported by the Wisconsin Agriculture Experiment Station, and was partially supported by National Research Initiative Grant no. 2009-35205-05099 from the USDA Cooperative State Research, Education, and Extension Service, NSF DEB-0089742, and NSF DMS-044371. KAW acknowledges financial support from the National Association of Animal Breeders (Columbia, MO). Comments from the anonymous reviewers and the editor are acknowledged.
