Abstract
In genetic epidemiology, epistasis has been studied by many researchers seeking to understand the underlying causes of complex diseases. Identifying gene-gene and/or gene-environment interactions is becoming more challenging because multiple genetic and environmental factors act together or independently. The limitations of current computational approaches motivated the development of a deep learning method in our recent study. That approach trained a multilayered feedforward neural network to discover interacting genes associated with complex diseases. The models were evaluated under various simulated scenarios and compared with previous methods. The results showed significant improvements in predicting gene interactions over traditional machine learning techniques. The present study extends that work to maximize the predictive performance of the method by tuning its hyperparameters using Cartesian grid search and random grid search. Several experiments are conducted on real datasets to identify higher-order interacting genes responsible for diseases. The findings demonstrate that randomly chosen trials are more efficient than trials chosen by grid search for optimizing hyperparameters. The optimal configuration of hyperparameter values improved model performance without overfitting. The results report the top 30 gene interactions responsible for sporadic breast cancer and hypertension.
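The two search strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the search space, the function names, and the scoring function `toy_score` are all hypothetical stand-ins, and `toy_score` replaces the training and validation of a real neural network. Cartesian grid search enumerates every combination of values, while random grid search evaluates only a fixed number of combinations drawn at random from the same grid.

```python
import itertools
import random

# Hypothetical search space; the values are illustrative, not taken from the paper.
SEARCH_SPACE = {
    "hidden_units": [32, 64, 128, 256],
    "learning_rate": [0.001, 0.005, 0.01, 0.05],
    "dropout": [0.0, 0.2, 0.5],
}


def cartesian_grid(space):
    """Yield every combination of hyperparameter values (exhaustive search)."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_grid(space, n_trials, seed=0):
    """Draw n_trials distinct configurations uniformly from the same grid."""
    rng = random.Random(seed)
    all_configs = list(cartesian_grid(space))
    return rng.sample(all_configs, min(n_trials, len(all_configs)))


def toy_score(config):
    """Stand-in for a validation metric; a real study would train a model here."""
    return (
        1.0
        - abs(config["learning_rate"] - 0.01) * 10
        - abs(config["hidden_units"] - 128) / 1000
        - config["dropout"] * 0.1
    )


def best_config(configs, score_fn):
    """Return the configuration with the highest score."""
    return max(configs, key=score_fn)


if __name__ == "__main__":
    grid = list(cartesian_grid(SEARCH_SPACE))
    sample = random_grid(SEARCH_SPACE, n_trials=10)
    print("grid trials:", len(grid), "best:", best_config(grid, toy_score))
    print("random trials:", len(sample), "best:", best_config(sample, toy_score))
```

The trade-off is the number of model fits: the Cartesian grid grows multiplicatively with each added hyperparameter, whereas the random variant keeps the trial budget fixed at the cost of possibly missing the best grid point.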
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Uppu, S., Krishna, A. (2017). Tuning Hyperparameters for Gene Interaction Models in Genome-Wide Association Studies. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10638. Springer, Cham. https://doi.org/10.1007/978-3-319-70139-4_80
DOI: https://doi.org/10.1007/978-3-319-70139-4_80
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70138-7
Online ISBN: 978-3-319-70139-4
eBook Packages: Computer Science, Computer Science (R0)