
Supervised Learning Algorithms


Part of the book series: Integrated Series in Information Systems ((ISIS,volume 36))

Abstract

Supervised learning algorithms train learning models efficiently so that the models can deliver high classification accuracy. In general, supervised learning algorithms search for optimal values of the model parameters using large data sets without overfitting the model. A careful, systematic design of the learning algorithm is therefore essential. The machine learning field suggests three phases for the design of a supervised learning algorithm: a training phase, a validation phase, and a testing phase. Accordingly, it recommends dividing the data set into three subsets to carry out these tasks, and it suggests defining or selecting suitable performance evaluation metrics to train, validate, and test the supervised learning models. The objectives of this chapter are therefore to discuss these three phases of a supervised learning algorithm and the three performance evaluation metrics called domain division, classification accuracy, and oscillation characteristics. The chapter also introduces five new performance evaluation metrics, called delayed learning, sporadic learning, deteriorate learning, heedless learning, and stabilized learning, which can help to measure classification accuracy under oscillation characteristics.
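The three-phase design described above rests on a three-way division of the data set and on a performance metric evaluated on each subset. The following is a minimal sketch of that workflow, not the chapter's own algorithm; the function names, split fractions, and toy data are illustrative assumptions.

```python
import random

def train_val_test_split(data, labels, val_frac=0.1, test_frac=0.2, seed=0):
    # Shuffle indices reproducibly, then carve out three disjoint subsets
    # for the training, validation, and testing phases.
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_frac)
    n_val = int(len(idx) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    pick = lambda ix: ([data[i] for i in ix], [labels[i] for i in ix])
    return pick(train_idx), pick(val_idx), pick(test_idx)

def accuracy(y_true, y_pred):
    # Classification accuracy: fraction of correctly classified examples.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy data set: 100 examples with a binary label.
X = list(range(100))
y = [x % 2 for x in X]
train, val, test = train_val_test_split(X, y)
print(len(train[0]), len(val[0]), len(test[0]))  # 70 10 20
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))      # 0.75
```

In practice the validation subset is used to tune model parameters (e.g., to detect overfitting), while the test subset is held out until the final evaluation; the metric computed on each subset over successive training iterations is also what makes oscillation characteristics observable.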



Acknowledgements

The oscillation-based measures were developed during my visit to the University of California, Berkeley, in Fall 2013. I take this opportunity to thank Professor Bin Yu for her financial support and valuable discussions.


Copyright information

© 2016 Springer Science+Business Media New York

About this chapter

Cite this chapter

Suthaharan, S. (2016). Supervised Learning Algorithms. In: Machine Learning Models and Algorithms for Big Data Classification. Integrated Series in Information Systems, vol 36. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7641-3_8
