Improving the Reproducibility of Genetic Association Results Using Genotype Resampling Methods
Replication may be an inadequate gold standard for substantiating the significance of results from genome-wide association studies (GWAS). Successful replication provides evidence for true results and against spurious findings, but attributes of the study population also influence the observed significance of a genetic effect. We hypothesize that when an interaction found to be significant in a GWAS of one population fails to replicate in a second population, the failure is sometimes attributable to differences in minor allele frequencies, and that resampling the replication dataset by genotype to match the minor allele frequencies of the discovery data can improve estimates of the interaction's significance. We show via simulation that resampling the replication data produces results more concordant with the discovery findings. We recommend that a failure to replicate GWAS results not be taken immediately as refuting previously observed findings and, conversely, that replication not be taken as a guarantee of significance. We suggest that datasets be compared more critically and in their biological context.
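The abstract does not spell out the resampling procedure, so the following is only a minimal sketch of one way genotype resampling could work, under stated assumptions: genotypes are coded as minor-allele counts (0, 1, or 2), a single SNP is matched at a time, and replication individuals are drawn with replacement within each genotype class so that the class proportions equal those of the discovery data. The function names (`minor_allele_frequency`, `resample_by_genotype`) and the record layout are hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def minor_allele_frequency(genotypes):
    """MAF for genotypes coded as minor-allele counts (0, 1, or 2)."""
    return sum(genotypes) / (2 * len(genotypes))


def resample_by_genotype(replication, discovery_genotypes, snp, n=None, seed=0):
    """Resample replication records so that genotype proportions at `snp`
    match those observed in the discovery data (illustrative assumption).

    replication: list of dicts, each holding a genotype under the key `snp`
    discovery_genotypes: list of genotypes (0/1/2) at the same SNP
    n: size of the resampled dataset (defaults to len(replication))
    """
    rng = random.Random(seed)
    n = n or len(replication)
    target = Counter(discovery_genotypes)
    total = sum(target.values())

    # Group replication records by their genotype at the SNP.
    strata = {}
    for record in replication:
        strata.setdefault(record[snp], []).append(record)

    resampled = []
    for genotype, count in target.items():
        if genotype not in strata:
            raise ValueError(f"no replication records with genotype {genotype}")
        # Number of draws for this genotype class, proportional to discovery.
        k = round(n * count / total)
        # Sample with replacement inside the genotype class.
        resampled.extend(rng.choices(strata[genotype], k=k))
    rng.shuffle(resampled)
    return resampled


# Toy data: discovery MAF is 0.30, replication MAF is 0.10.
discovery = [0] * 50 + [1] * 40 + [2] * 10
replication = [{"id": i, "snp1": g}
               for i, g in enumerate([0] * 81 + [1] * 18 + [2] * 1)]

matched = resample_by_genotype(replication, discovery, "snp1")
print(minor_allele_frequency([r["snp1"] for r in replication]))  # 0.1
print(minor_allele_frequency([r["snp1"] for r in matched]))      # 0.3
```

Because sampling is stratified by genotype, phenotype and covariate values travel with each resampled individual, so the genotype frequencies change while the within-genotype phenotype distribution is preserved in expectation. Extending the sketch to an interaction between two SNPs would mean stratifying on the joint genotype, which is not shown here.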
Keywords: GWAS, SNPs, Epistasis, Complex diseases, Reproducibility
This work was supported by National Institutes of Health grants LM009012 and AI116794.