Abstract
Since the beginning of the computer era, the amount of data generated has grown continuously, and these data could be of great use given efficient learning techniques. Learning from data to make reliable predictions and to discover new patterns and theories remains one of the most challenging tasks for researchers. Machine learning detects hidden patterns in data, learns from them, and makes reliable predictions on unseen data. It is used in a range of applications, including bioinformatics, cheminformatics, marketing, linguistics, email filtering, and optical character recognition. In this chapter, we describe the types of learning, applications of machine learning, the steps to generate machine learning models using various learning algorithms, and the validation of the generated models. We also discuss step-by-step generation of models using the Weka workbench, a collection of machine learning algorithms and data preprocessing tools.
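To make the idea of learning from data and validating the resulting model concrete, the following is a minimal sketch rather than the chapter's own procedure. It assumes a 1-nearest-neighbor classifier and k-fold cross-validation as the validation scheme; the function names and the toy data set are hypothetical and chosen only for illustration, and the code uses plain Python instead of Weka.

```python
import random


def nearest_neighbor_predict(train, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(train, key=lambda row: sq_dist(row[0], query))
    return label


def k_fold_accuracy(data, k=5, seed=0):
    """Estimate accuracy on unseen data with k-fold cross-validation."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    # Split the shuffled rows into k disjoint folds.
    folds = [rows[i::k] for i in range(k)]
    correct = 0
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct += sum(
            nearest_neighbor_predict(train, x) == y for x, y in test_fold
        )
    return correct / len(rows)


# Hypothetical toy data: two well-separated clusters labelled "A" and "B".
toy_data = (
    [((0.1 * i, 0.1 * i), "A") for i in range(10)]
    + [((5.0 + 0.1 * i, 5.0 + 0.1 * i), "B") for i in range(10)]
)

accuracy = k_fold_accuracy(toy_data, k=5)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Each fold is held out once, so every prediction is made on an example the model did not see during training, which is the sense of "unseen data" used above.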
Acknowledgments
Salma Jamal acknowledges a Senior Research Fellowship from the Indian Council of Medical Research (ICMR).
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Jamal, S., Goyal, S., Grover, A., Shanker, A. (2018). Machine Learning: What, Why, and How?. In: Shanker, A. (eds) Bioinformatics: Sequences, Structures, Phylogeny. Springer, Singapore. https://doi.org/10.1007/978-981-13-1562-6_16
DOI: https://doi.org/10.1007/978-981-13-1562-6_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1561-9
Online ISBN: 978-981-13-1562-6
eBook Packages: Biomedical and Life Sciences (R0)