Genome Mining Using Machine Learning Techniques

  • Conference paper
Inclusive Smart Cities and e-Health (ICOST 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9102)

Abstract

The complete sequencing of the human genome was a major milestone in modern biology, but it also raised a new set of challenges in exploring the functions and interactions of different parts of the genome. One application is predicting disorders by mining genotype data and understanding how interactions between genetic loci lead to certain human diseases.

However, disease phenotypes are typically genetically complex. The associated data sets are large and high-dimensional, while the sample size is usually small.

Recently, machine learning and predictive modeling approaches have been applied successfully to understand genotype-phenotype relations and link them to human diseases. They are well suited to handling the large, high-dimensional data sets produced by human genome studies. Machine learning techniques have been applied in virtually all data mining domains and have proven effective in BioData mining as well.

This paper describes some of the techniques that have been adopted in recent studies in human genome analysis.
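
The setting described above, many genetic features and few samples, can be illustrated with a small sketch. It is not a method taken from the paper: the synthetic disease model, the nearest-centroid classifier, and all function names (`simulate_genotypes`, `select_snps`, `predict`, `loocv_accuracy`) are assumptions chosen only for illustration. SNP ranking is repeated inside every leave-one-out fold so that the held-out sample never influences feature selection.

```python
import random


def simulate_genotypes(n_samples, n_snps, causal, seed=0):
    """Synthetic SNP matrix coded 0/1/2 (minor-allele counts) plus a binary phenotype.

    Hypothetical disease model: risk grows with the allele count at the causal loci.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
        score = sum(row[j] for j in causal) + rng.gauss(0, 0.5)
        X.append(row)
        y.append(1 if score > len(causal) else 0)
    return X, y


def select_snps(X, y, k):
    """Rank SNPs by the absolute case/control difference in mean allele count."""
    cases = [r for r, label in zip(X, y) if label == 1]
    ctrls = [r for r, label in zip(X, y) if label == 0]

    def mean(rows, j):
        return sum(r[j] for r in rows) / len(rows) if rows else 0.0

    scores = [abs(mean(cases, j) - mean(ctrls, j)) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def predict(train_X, train_y, x, k):
    """Nearest-centroid prediction using only the k top-ranked SNPs."""
    idx = select_snps(train_X, train_y, k)
    centroids = {}
    for label in (0, 1):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        if rows:  # skip a class that is absent from the training fold
            centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]

    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(idx, c))

    return min(centroids, key=lambda label: dist(centroids[label]))


def loocv_accuracy(X, y, k):
    """Leave-one-out cross-validation; feature selection is redone in each fold."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


# 40 samples against 500 SNPs: far more features than observations.
X, y = simulate_genotypes(n_samples=40, n_snps=500, causal=[3, 42, 77], seed=1)
acc = loocv_accuracy(X, y, k=5)
print(f"LOOCV accuracy with 40 samples and 500 SNPs: {acc:.2f}")
```

Reducing the SNPs to a handful of ranked candidates before classification is one simple way to cope with high dimensionality. Selecting features on the full data set before cross-validation would instead give an optimistic accuracy estimate.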


Author information

Corresponding author

Correspondence to Peter Wlodarczak.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wlodarczak, P., Soar, J., Ally, M. (2015). Genome Mining Using Machine Learning Techniques. In: Geissbühler, A., Demongeot, J., Mokhtari, M., Abdulrazak, B., Aloulou, H. (eds) Inclusive Smart Cities and e-Health. ICOST 2015. Lecture Notes in Computer Science, vol 9102. Springer, Cham. https://doi.org/10.1007/978-3-319-19312-0_39

  • DOI: https://doi.org/10.1007/978-3-319-19312-0_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19311-3

  • Online ISBN: 978-3-319-19312-0

  • eBook Packages: Computer Science (R0)
