Correlation-Based and Causal Feature Selection Analysis for Ensemble Classifiers

Duangsoithong, Rakkrit; Windeatt, Terry

doi:10.1007/978-3-642-12159-3_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5998)

Included in the following conference series:

IAPR Workshop on Artificial Neural Networks in Pattern Recognition

Abstract

High-dimensional feature spaces with relatively few samples usually lead to poor classifier performance in machine learning, neural network and data mining systems. This paper presents a comparative analysis of correlation-based and causal feature selection for ensemble classifiers. MLP and SVM are used as base classifiers and compared with Naive Bayes and Decision Tree. According to the results, the correlation-based feature selection algorithm eliminates more redundant and irrelevant features, and provides slightly better accuracy and lower complexity, than causal feature selection. An ensemble built with the Bagging algorithm can improve accuracy for both correlation-based and causal feature selection.
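
The pipeline the abstract describes (filter out irrelevant and redundant features, then train a bagged ensemble on what remains) can be illustrated with a minimal sketch. This is not the authors' implementation: plain Pearson correlation stands in for the correlation-based filter, a toy nearest-centroid classifier stands in for the MLP and SVM base classifiers, and every function name, threshold and data value below is a hypothetical choice made for illustration.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def select_features(X, y, relevance_min=0.3, redundancy_max=0.9):
    """Keep features correlated with the class, skipping near-copies of kept ones.
    Both thresholds are assumed values, not taken from the paper."""
    cols = list(zip(*X))
    relevance = [abs(pearson(c, y)) for c in cols]
    ranked = sorted(range(len(cols)), key=lambda j: -relevance[j])
    kept = []
    for j in ranked:
        if relevance[j] < relevance_min:
            break  # every remaining feature is treated as irrelevant
        if all(abs(pearson(cols[j], cols[k])) < redundancy_max for k in kept):
            kept.append(j)  # relevant and not redundant with a kept feature
    return sorted(kept)

def project(row, kept):
    """Restrict one sample to the selected feature indices."""
    return [row[j] for j in kept]

def fit_centroids(X, y):
    """Toy base classifier: one mean vector per class label."""
    centroids = {}
    for label in set(y):
        rows = [r for r, t in zip(X, y) if t == label]
        centroids[label] = [mean(c) for c in zip(*rows)]
    return centroids

def predict_centroids(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])))

def bagging_fit(X, y, n_models=11, seed=0):
    """Train each base classifier on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Combine the base classifiers by majority vote."""
    votes = [predict_centroids(m, row) for m in models]
    return max(set(votes), key=votes.count)

# Synthetic data: feature 0 is informative, feature 1 is a linear copy of
# feature 0 (redundant), feature 2 is unrelated to the class (irrelevant).
y = [0] * 10 + [1] * 10
f0 = [i / 10 for i in range(10)] + [2 + i / 10 for i in range(10)]
f2 = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]
X = [[f0[i], 2 * f0[i] + 1, f2[i]] for i in range(20)]

kept = select_features(X, y)
models = bagging_fit([project(r, kept) for r in X], y)
```

On this synthetic data the filter drops the irrelevant feature and keeps only one of the two correlated copies; `bagging_predict(models, project(row, kept))` then classifies a new `row` by majority vote over the bootstrap-trained models.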

Author information

Authors and Affiliations

Center for Vision, Speech and Signal Processing, University of Surrey, Guildford, United Kingdom, GU2 7XH
Rakkrit Duangsoithong & Terry Windeatt

Editor information

Editors and Affiliations

Department of Neural Information Processing, Oberer Eselsberg, University of Ulm, 89069, Ulm, Germany
Friedhelm Schwenker
Center for Informatics Science, Nile University, 12677, Giza, Egypt
Neamat El Gayar

About this paper

Cite this paper

Duangsoithong, R., Windeatt, T. (2010). Correlation-Based and Causal Feature Selection Analysis for Ensemble Classifiers. In: Schwenker, F., El Gayar, N. (eds) Artificial Neural Networks in Pattern Recognition. ANNPR 2010. Lecture Notes in Computer Science, vol 5998. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12159-3_3

DOI: https://doi.org/10.1007/978-3-642-12159-3_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12158-6
Online ISBN: 978-3-642-12159-3
eBook Packages: Computer Science, Computer Science (R0)
