Machine Learning: What, Why, and How?

Jamal, Salma; Goyal, Sukriti; Grover, Abhinav; Shanker, Asheesh

doi:10.1007/978-981-13-1562-6_16

Chapter
First Online: 14 October 2018

Abstract

Since the beginning of the computer era, an enormous and ever-increasing amount of data has been generated, and it could be of great use given efficient learning techniques. Learning from data to make reliable predictions and to discover new patterns and theories has been the most challenging task for researchers. Machine learning detects hidden insights in data, learns from them, and makes reliable predictions on unseen data. It is used in a range of applications, including bioinformatics, cheminformatics, marketing, linguistics, email filtering, optical character recognition, and many more. In this chapter, we describe the types of learning, the applications of machine learning, the steps for generating machine learning models with various learning algorithms, and the validation of the generated models. We also discuss step-by-step model generation using the Weka workbench, a collection of machine learning algorithms and data preprocessing tools.

Acknowledgments

Salma Jamal acknowledges a Senior Research Fellowship from the Indian Council of Medical Research (ICMR).

Author information

Authors and Affiliations

Department of Bioscience and Biotechnology, Banasthali Vidyapith, Rajasthan, India
Salma Jamal, Sukriti Goyal & Asheesh Shanker
School of Biotechnology, Jawaharlal Nehru University, New Delhi, India
Abhinav Grover
Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar, India
Asheesh Shanker

Editor information

Editors and Affiliations

Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar, India
Asheesh Shanker

About this chapter

Cite this chapter

Jamal, S., Goyal, S., Grover, A., Shanker, A. (2018). Machine Learning: What, Why, and How?. In: Shanker, A. (eds) Bioinformatics: Sequences, Structures, Phylogeny. Springer, Singapore. https://doi.org/10.1007/978-981-13-1562-6_16

DOI: https://doi.org/10.1007/978-981-13-1562-6_16
Published: 14 October 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1561-9
Online ISBN: 978-981-13-1562-6
eBook Packages: Biomedical and Life Sciences (R0)
