Abstract
Supervised learning algorithms enable models to be trained efficiently so that they can achieve high classification accuracy. In general, supervised learning algorithms support the search for optimal model parameter values over large data sets without overfitting the model; a careful, systematic design of the learning algorithm is therefore essential. The machine learning field suggests three phases for the design of a supervised learning algorithm: the training phase, the validation phase, and the testing phase. Accordingly, it recommends dividing the data set into three subsets to carry out these tasks. It also suggests defining or selecting suitable performance evaluation metrics to train, validate, and test supervised learning models. The objectives of this chapter are therefore to discuss these three phases of a supervised learning algorithm and three performance evaluation metrics: domain division, classification accuracy, and oscillation characteristics. The chapter also introduces five new performance evaluation metrics, called delayed learning, sporadic learning, deteriorate learning, heedless learning, and stabilized learning, which can help measure classification accuracy under oscillation characteristics.
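The three-phase workflow the abstract describes (divide the data into training, validation, and testing subsets, then evaluate with a metric such as classification accuracy) can be sketched in a few lines. The 60/20/20 split ratio, the toy one-dimensional data, and the threshold classifier below are illustrative assumptions, not prescriptions from the chapter:

```python
import random

def split_dataset(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle and divide a data set into the three recommended
    subsets: training, validation, and testing.  The 60/20/20
    ratio is an assumed example, not a fixed rule."""
    rng = random.Random(seed)
    data = data[:]          # copy so the caller's list is untouched
    rng.shuffle(data)
    n_train = int(len(data) * train_frac)
    n_val = int(len(data) * val_frac)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

def accuracy(model, subset):
    """Classification accuracy: the fraction of correctly
    predicted labels on the given subset."""
    correct = sum(1 for x, y in subset if model(x) == y)
    return correct / len(subset)

# Toy 1-D data: class 0 clusters near 0.0, class 1 near 1.0.
data = ([(i / 10.0, 0) for i in range(10)] +
        [(1.0 + i / 10.0, 1) for i in range(10)])
train, val, test = split_dataset(data)

# A deliberately simple threshold "model" fitted on the training
# subset only; the validation and test subsets are held out.
threshold = sum(x for x, _ in train) / len(train)
def model(x):
    return 0 if x < threshold else 1

print(f"train/val/test sizes: {len(train)}/{len(val)}/{len(test)}")
print(f"validation accuracy: {accuracy(model, val):.2f}")
print(f"test accuracy: {accuracy(model, test):.2f}")
```

In practice the validation subset guides model and parameter selection, while the test subset is touched only once, for the final accuracy estimate.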
Acknowledgements
The oscillation-based measures were developed during my visit to the University of California, Berkeley, in Fall 2013. I take this opportunity to thank Professor Bin Yu for her financial support and valuable discussions.
Copyright information
© 2016 Springer Science+Business Media New York
Cite this chapter
Suthaharan, S. (2016). Supervised Learning Algorithms. In: Machine Learning Models and Algorithms for Big Data Classification. Integrated Series in Information Systems, vol 36. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7641-3_8
DOI: https://doi.org/10.1007/978-1-4899-7641-3_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7640-6
Online ISBN: 978-1-4899-7641-3
eBook Packages: Business and Management, Business and Management (R0)