Classification

Data Analytics

Abstract

Classification is a supervised learning task that uses labeled data to assign objects to classes. We distinguish false positive and false negative errors and define several indicators that quantify classifier performance. Classification performance is assessed using pairs of indicators, which we illustrate with the receiver operating characteristic (ROC) and the precision-recall diagram. Several classifiers, each with specific features and drawbacks, are presented in detail: the naive Bayes classifier, linear discriminant analysis, the support vector machine (SVM) using the kernel trick, nearest neighbor classifiers, learning vector quantization, and hierarchical classification using regression trees.
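As a rough illustration of the indicator pairs mentioned above, the following Python sketch counts true and false positives and negatives for a binary problem and derives the two pairs behind the ROC (true positive rate, false positive rate) and the precision-recall diagram (precision, recall). The function names, the 0/1 label encoding, and the example data are assumptions made for illustration; they are not taken from the chapter.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (assumed: 1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positive error
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negative error
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def indicators(tp, fp, fn, tn):
    """Derive indicator pairs; a ratio with an empty denominator is reported as 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "tpr": ratio(tp, tp + fn),        # true positive rate (recall), ROC y-axis
        "fpr": ratio(fp, fp + tn),        # false positive rate, ROC x-axis
        "precision": ratio(tp, tp + fp),  # precision, paired with recall
    }


# Hypothetical labels and classifier outputs, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
result = indicators(*counts)
print(counts)  # (3, 1, 1, 3)
print(result)  # {'tpr': 0.75, 'fpr': 0.25, 'precision': 0.75}
```

One (FPR, TPR) pair gives a single point in the ROC diagram and one (recall, precision) pair gives a single point in the precision-recall diagram; sweeping a decision threshold of a scoring classifier would trace out the full curves.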

Author information

Corresponding author

Correspondence to Thomas A. Runkler.

Copyright information

© 2012 Vieweg+Teubner Verlag | Springer Fachmedien Wiesbaden

About this chapter

Cite this chapter

Runkler, T. (2012). Classification. In: Data Analytics. Vieweg+Teubner Verlag, Wiesbaden. https://doi.org/10.1007/978-3-8348-2589-6_8

  • DOI: https://doi.org/10.1007/978-3-8348-2589-6_8

  • Publisher Name: Vieweg+Teubner Verlag, Wiesbaden

  • Print ISBN: 978-3-8348-2588-9

  • Online ISBN: 978-3-8348-2589-6

  • eBook Packages: Computer Science, Computer Science (R0)
