
Using Confusion Matrices and Confusion Graphs to Design Ensemble Classification Models from Large Datasets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6862)

Abstract

Classification modeling is one of the methods commonly employed for predictive data mining. Ensemble classification creates many base models and combines them into one model in order to increase classification performance. This paper reports on a study conducted to establish whether the information in the confusion matrix of a single classification model can serve as a basis for designing ensemble base models that provide high predictive performance. Positive-versus-negative (pVn) classification was studied as a method of base model design. Confusion graphs were used as input to an algorithm that determines the classes for each base model. Experiments were conducted to compare the levels of diversity provided by all-classes-at-once (ACA) and pVn base models, using a statistical measure of dissimilarity, and to compare the performance of pVn ensembles, ACA ensembles, and single k-class models using classification trees and multi-layer perceptron artificial neural networks. The experimental results demonstrated that even though ACA base models provide a higher level of diversity than pVn base models, this diversity does not translate into higher predictive performance. The results also demonstrated that pVn ensemble models can provide predictive performance higher than that of both single k-class models and ACA ensemble models.
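To make the confusion-graph idea concrete, the following is a minimal illustrative sketch (not the paper's algorithm, whose details are in the full text): from a k-class confusion matrix, build an undirected confusion graph whose edge weights are the mutual misclassification counts between class pairs, then pick the most-confused pair as candidate classes to group on the positive side of one pVn base model. The function names and the pairing heuristic are assumptions made for illustration only.

```python
# Illustrative sketch only: the paper's base-model design algorithm is more
# involved; this just shows how a confusion graph can be read off a
# confusion matrix and used to suggest class groupings for pVn base models.

def confusion_graph(matrix):
    """Return {(i, j): weight} edges of the confusion graph.

    matrix[i][j] = number of class-i instances predicted as class j,
    so the weight of edge (i, j) is the total mutual confusion
    matrix[i][j] + matrix[j][i] between the two classes.
    """
    k = len(matrix)
    edges = {}
    for i in range(k):
        for j in range(i + 1, k):
            weight = matrix[i][j] + matrix[j][i]
            if weight > 0:
                edges[(i, j)] = weight
    return edges

def most_confused_pair(matrix):
    """Suggest the class pair a pVn base model might group as 'positive'."""
    edges = confusion_graph(matrix)
    return max(edges, key=edges.get) if edges else None

# Example: a 3-class confusion matrix where classes 0 and 2 are often mixed up.
cm = [
    [50,  2, 10],
    [ 1, 60,  3],
    [12,  0, 55],
]
print(confusion_graph(cm))     # {(0, 1): 3, (0, 2): 22, (1, 2): 3}
print(most_confused_pair(cm))  # (0, 2)
```

Grouping the heavily confused classes 0 and 2 into the positive set of a pVn base model would let that model specialize in the decision boundary the single k-class model finds hardest, which is the intuition the abstract describes.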




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lutu, P.E.N. (2011). Using Confusion Matrices and Confusion Graphs to Design Ensemble Classification Models from Large Datasets. In: Cuzzocrea, A., Dayal, U. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2011. Lecture Notes in Computer Science, vol 6862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23544-3_23


  • DOI: https://doi.org/10.1007/978-3-642-23544-3_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23543-6

  • Online ISBN: 978-3-642-23544-3

  • eBook Packages: Computer Science (R0)
