Brief Introduction to Statistical Machine Learning

Empirical Approach to Machine Learning

Part of the book series: Studies in Computational Intelligence (SCI, volume 800)

Abstract

This chapter gives an overview of the theory of probability, statistical learning, and machine learning, covering the main ideas and the most popular and widely used methods in these areas. As a starting point, randomness and determinism, as well as the nature of real-world problems, are discussed. Then the basic and well-known topics of traditional probability theory and statistics are recalled, including probability mass and distribution, probability density and moments, density estimation, and Bayesian and other branches of probability theory, followed by an analysis. Well-known data pre-processing techniques and unsupervised and supervised machine learning methods are then covered. These include a brief introduction to distance metrics, normalization and standardization, feature selection, and orthogonalization, as well as a review of the most representative clustering, classification, regression, and prediction approaches of various types. Finally, image processing is briefly covered, including popular image transformation techniques and a number of image feature extraction techniques at three different levels.

Author information

Correspondence to Plamen P. Angelov.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Angelov, P.P., Gu, X. (2019). Brief Introduction to Statistical Machine Learning. In: Empirical Approach to Machine Learning. Studies in Computational Intelligence, vol 800. Springer, Cham. https://doi.org/10.1007/978-3-030-02384-3_2
