Abstract
A major milestone in modern biology was the complete sequencing of the human genome. It also raised a new set of challenges in exploring the functions and interactions of different parts of the genome. One application is predicting disorders by mining genotype data and understanding how interactions between genetic loci lead to certain human diseases.
However, disease phenotypes are typically genetically complex. The associated data sets are large and high-dimensional, while the sample size is usually small.
Recently, machine learning and predictive modeling approaches have been successfully applied to understand genotype-phenotype relations and link them to human diseases. They are well suited to overcome the problems posed by the large, high-dimensional data sets derived from the human genome. Machine learning techniques have been applied in virtually all data mining domains and have proven effective in BioData mining as well.
This paper describes some of the techniques that have been adopted in recent studies in human genome analysis.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wlodarczak, P., Soar, J., Ally, M. (2015). Genome Mining Using Machine Learning Techniques. In: Geissbühler, A., Demongeot, J., Mokhtari, M., Abdulrazak, B., Aloulou, H. (eds) Inclusive Smart Cities and e-Health. ICOST 2015. Lecture Notes in Computer Science, vol 9102. Springer, Cham. https://doi.org/10.1007/978-3-319-19312-0_39
DOI: https://doi.org/10.1007/978-3-319-19312-0_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19311-3
Online ISBN: 978-3-319-19312-0
eBook Packages: Computer Science, Computer Science (R0)