Supervised Learning

Part of the book series: Cognitive Technologies ((COGTECH))

Abstract

Supervised learning accounts for a substantial share of research activity in machine learning, and many supervised learning techniques have found application in the processing of multimedia content. The defining characteristic of supervised learning is the availability of annotated training data. The name invokes the idea of a ‘supervisor’ that instructs the learning system on the labels to associate with training examples; typically these labels are class labels in classification problems. Supervised learning algorithms induce models from these training data, and these models can then be used to classify other, unlabelled data. In this chapter we ground our analysis of supervised learning in the theory of risk minimization. We provide an overview of support vector machines and nearest neighbour classifiers – probably the two most popular supervised learning techniques employed in multimedia research.
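The workflow the abstract describes – induce a model from labelled training examples, then use it to classify unlabelled data – can be illustrated with a minimal 1-nearest-neighbour classifier. This is an illustrative sketch only, not the chapter's implementation; the toy feature vectors and class labels below are invented for the example.

```python
import math

def nearest_neighbour(train, query):
    """Return the class label of the training example closest to `query`
    under Euclidean distance. `train` is a list of (vector, label) pairs."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    _, label = min(train, key=lambda example: dist(example[0], query))
    return label

# Annotated training data: two class labels in a 2-D feature space.
train = [((0.0, 0.0), "red"), ((0.1, 0.2), "red"),
         ((1.0, 1.0), "blue"), ((0.9, 1.1), "blue")]

# Classify a previously unseen (unlabelled) point.
print(nearest_neighbour(train, (0.2, 0.1)))  # prints "red"
```

Here the "model" is simply the stored training set itself, which is why nearest neighbour methods are often called lazy or instance-based learners.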




Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Cunningham, P., Cord, M., Delany, S.J. (2008). Supervised Learning. In: Cord, M., Cunningham, P. (eds) Machine Learning Techniques for Multimedia. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75171-7_2

  • DOI: https://doi.org/10.1007/978-3-540-75171-7_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75170-0

  • Online ISBN: 978-3-540-75171-7

  • eBook Packages: Computer Science (R0)
