Abstract
Machine learning (ML) algorithms seek to extract the most beneficial information from raw data. As mentioned in previous chapters, preprocessing steps may be needed to prepare a data set before it is fed into an ML algorithm. ML algorithms have attracted the attention of researchers all over the world during the last few decades. In traditional algorithms, the analyst had to define the rules needed to obtain the output, which was not always feasible. ML algorithms instead build models, or rules, from a training data set that includes both input and output data points, extracting useful information about the underlying system. The built model can then be tested on a verification data set that has no overlap with the training set. If the model achieves acceptable performance measures, it can be applied to new cases for prediction. It should be kept in mind that ML algorithms are not expected to perform magic. Rather, they explore and analyze training data at a scale the human brain cannot handle in order to develop a predictive model whose performance is acceptable on the verification data set. A predictive model should generalize, meaning it performs satisfactorily on both the training and the verification data sets.
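The train/verify workflow described above can be sketched as follows. This is a minimal illustration, assuming scikit-learn as the toolkit and a synthetic linear data set; the chapter itself does not prescribe either.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: inputs X and outputs y with a known linear relation plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))        # input data points
y = 3.0 * X.ravel() + rng.normal(0, 1, 200)  # output data points

# The training and verification sets must not overlap.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Build the model on the training set only.
model = LinearRegression().fit(X_train, y_train)

# A generalized model performs satisfactorily on both sets.
r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))
```

A large gap between the training and verification scores would signal overfitting, i.e., a model that has memorized the training data rather than learned the underlying system.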
The main purpose of this chapter is to present the most commonly used statistical and ML algorithms, with an emphasis on their application rather than on the theory behind their development. Accordingly, this chapter summarizes the most important applications and features of ML algorithms. It should be noted that the term “machine learning” (ML) is not interchangeable with “artificial intelligence” (AI): ML is a subfield of AI, sometimes referred to as “predictive modeling” or “predictive algorithms.” One of the most famous theorems in the ML area is the “no free lunch” theorem, which states that no single algorithm performs better than all others across every application. Consequently, an analyst needs a sound understanding of the capabilities of each algorithm in order to select the most suitable one.
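In practice, the “no free lunch” theorem means candidate algorithms are benchmarked on the same data rather than chosen a priori. A sketch of such a comparison, assuming scikit-learn and a synthetic classification data set (neither is specified by the chapter):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

# One shared data set for all candidates.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Several commonly used supervised classifiers as candidates.
candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes": GaussianNB(),
}

# Mean 5-fold cross-validation accuracy for each candidate.
scores = {name: cross_val_score(clf, X, y, cv=5).mean()
          for name, clf in candidates.items()}
```

Which algorithm ranks highest depends on the data set; rerunning the comparison on a different problem can reorder the candidates, which is exactly what the theorem predicts.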
In this chapter, several commonly used supervised ML algorithms are presented in detail. In general, supervised learning problems fall into either regression or classification tasks. For each type of task, the theory behind a few ML algorithms is explained. A few algorithm selection criteria are also discussed. Finally, a numerical example is presented for both a regression and a classification task in order to compare the performance of the various algorithms on the same data set.
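The distinction between the two task types comes down to the target variable: regression predicts a continuous value, while classification predicts a discrete label. A minimal contrast, assuming scikit-learn and a toy data set:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y_cont = np.array([1.1, 1.9, 3.2, 3.9])  # continuous target -> regression
y_label = np.array([0, 0, 1, 1])         # discrete target -> classification

reg = LinearRegression().fit(X, y_cont)
clf = LogisticRegression().fit(X, y_label)

pred_value = reg.predict([[2.5]])[0]  # a real number
pred_class = clf.predict([[2.5]])[0]  # one of the labels {0, 1}
```

The same input can thus be mapped either to a number on a continuous scale or to a category, depending on how the output data points in the training set are defined.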
© 2020 Springer Nature Switzerland AG
Cite this chapter
Balali, F., Nouri, J., Nasiri, A., Zhao, T. (2020). Machine Learning Principles. In: Data Intensive Industrial Asset Management. Springer, Cham. https://doi.org/10.1007/978-3-030-35930-0_8
Print ISBN: 978-3-030-35929-4
Online ISBN: 978-3-030-35930-0