
Machine Learning: What, Why, and How?


Abstract

Since the beginning of the computer era, the amount of data generated has grown continuously, and with efficient learning techniques this data could be of great use. Learning from data to make reliable predictions and to discover new patterns and theories has been among the most challenging tasks for researchers. Machine learning detects hidden insights in data, learns from them, and makes reliable predictions on unseen data. It is used in a range of applications, including bioinformatics, cheminformatics, marketing, linguistics, email filtering, optical character recognition, and many more. In this chapter, we describe the types of learning, the applications of machine learning, the steps to generate machine learning models using various learning algorithms, and the validation of the generated models. Additionally, we discuss the step-by-step generation of models using the Weka workbench, a collection of machine learning algorithms and data preprocessing tools.



Acknowledgments

Salma Jamal acknowledges a Senior Research Fellowship from the Indian Council of Medical Research (ICMR).


Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Jamal, S., Goyal, S., Grover, A., Shanker, A. (2018). Machine Learning: What, Why, and How? In: Shanker, A. (eds) Bioinformatics: Sequences, Structures, Phylogeny. Springer, Singapore. https://doi.org/10.1007/978-981-13-1562-6_16
