Abstract
Classification modeling is one of the methods commonly employed for predictive data mining. Ensemble classification is concerned with the creation of many base models which are combined into one model in order to increase classification performance. This paper reports on a study conducted to establish whether the information in the confusion matrix of a single classification model can serve as a basis for designing ensemble base models that provide high predictive performance. Positive-versus-negative (pVn) classification was studied as a method of base model design. Confusion graphs were used as input to an algorithm that determines the classes for each base model. Experiments were conducted to compare the levels of diversity provided by all-classes-at-once (ACA) and pVn base models using a statistical measure of dissimilarity, and to compare the performance of pVn ensembles, ACA ensembles, and single k-class models using classification trees and multi-layer perceptron artificial neural networks. The experimental results demonstrated that even though ACA base models provide a higher level of diversity than pVn base models, this diversity does not translate into higher predictive performance. The results also demonstrated that pVn ensemble models can provide predictive performance that is higher than that of both single k-class models and ACA ensemble models.
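The two building blocks the abstract names, a confusion graph derived from a single model's confusion matrix and a pairwise dissimilarity (diversity) measure between base models, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the `threshold` parameter, the symmetrised edge weight, and the simple disagreement measure are all assumptions made for the sketch.

```python
import numpy as np

def confusion_graph_edges(cm, threshold=0.05):
    """Derive weighted confusion-graph edges from a k-class confusion matrix.

    Classes i and j are joined by an edge when their mutual misclassification
    rate exceeds `threshold`; the weight is the symmetrised rate. Groups of
    heavily connected classes are candidates for a shared pVn base model.
    """
    cm = np.asarray(cm, dtype=float)
    row_totals = cm.sum(axis=1, keepdims=True)
    rates = cm / np.where(row_totals == 0, 1, row_totals)  # row-normalised
    k = cm.shape[0]
    edges = {}
    for i in range(k):
        for j in range(i + 1, k):
            w = rates[i, j] + rates[j, i]  # mutual confusion between i and j
            if w > threshold:
                edges[(i, j)] = w
    return edges

def disagreement(preds_a, preds_b):
    """Pairwise disagreement: fraction of cases where two models differ.

    One of the standard statistical diversity measures for classifier pairs.
    """
    a, b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(a != b))
```

For example, a 3-class confusion matrix in which classes 0 and 1 are frequently mistaken for each other yields an edge (0, 1) with a large weight, while rarely confused pairs produce no edge.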
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lutu, P.E.N. (2011). Using Confusion Matrices and Confusion Graphs to Design Ensemble Classification Models from Large Datasets. In: Cuzzocrea, A., Dayal, U. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2011. Lecture Notes in Computer Science, vol 6862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23544-3_23
DOI: https://doi.org/10.1007/978-3-642-23544-3_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23543-6
Online ISBN: 978-3-642-23544-3
eBook Packages: Computer Science (R0)