Abstract
Dimensionality reduction is an essential problem in data analysis that has received significant attention from several disciplines. It comprises two types of methods: feature extraction and feature selection. In this paper, we introduce a simple method for supervised feature selection in data classification tasks. The proposed hybrid feature selection (HFS) mechanism, RF-SEA (ReliefF-Shapley ensemble analysis), combines filter and wrapper models for dimension reduction. In the first stage, a filter model ranks the features by their ReliefF (RF) scores between classes, and a threshold is used to retain the features most relevant to the classes. In the second stage, a Shapley ensemble algorithm evaluates the contribution of each feature in the ranked subset to the classification task. Principal component analysis (PCA) is carried out as a preprocessing step before both stages. Experiments with several medical datasets show that the proposed approach can detect completely irrelevant features and remove redundant features without significantly hurting the performance of the classification algorithm. The results also show that RF-SEA obtains better classification performance than either the Shapley-value-based method or the ReliefF-based method alone.
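The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: basic Relief (one nearest hit and one nearest miss per instance) stands in for ReliefF, the Shapley values are estimated by Monte Carlo permutation sampling with leave-one-out 1-NN accuracy as the value function, the PCA preprocessing step is omitted, and all names (`relief_weights`, `shapley_values`, `rf_sea_sketch`, both thresholds) and the toy data are hypothetical.

```python
import random

def _dist(a, b, feats):
    # L1 distance restricted to the given feature indices
    return sum(abs(a[f] - b[f]) for f in feats)

def relief_weights(X, y):
    """Basic Relief weights (a simplification of ReliefF).
    Assumes features scaled to [0, 1], at least two classes,
    and at least two instances per class."""
    n, d = len(X), len(X[0])
    feats = range(d)
    w = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: _dist(X[i], X[j], feats))
        miss = min(misses, key=lambda j: _dist(X[i], X[j], feats))
        for f in feats:
            # reward separation from the miss, penalise distance to the hit
            w[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return w

def loo_accuracy(X, y, feats):
    """Value function (assumed): leave-one-out 1-NN accuracy on `feats`."""
    if not feats:
        # empty coalition: accuracy of always predicting the majority class
        return max(y.count(c) for c in set(y)) / len(y)
    correct = 0
    for i in range(len(X)):
        others = (k for k in range(len(X)) if k != i)
        j = min(others, key=lambda k: _dist(X[i], X[k], feats))
        correct += y[j] == y[i]
    return correct / len(X)

def shapley_values(X, y, feats, n_perms=50, seed=0):
    """Monte Carlo estimate of each feature's Shapley value."""
    rng = random.Random(seed)
    phi = {f: 0.0 for f in feats}
    for _ in range(n_perms):
        order = list(feats)
        rng.shuffle(order)
        chosen, prev = [], loo_accuracy(X, y, [])
        for f in order:
            chosen.append(f)
            cur = loo_accuracy(X, y, chosen)
            phi[f] += (cur - prev) / n_perms  # marginal contribution of f
            prev = cur
    return phi

def rf_sea_sketch(X, y, relief_threshold=0.0, shapley_threshold=0.0):
    # Stage 1 (filter): rank by Relief weight, keep those above the threshold
    w = relief_weights(X, y)
    kept = [f for f in range(len(w)) if w[f] > relief_threshold]
    ranked = sorted(kept, key=lambda f: -w[f])
    # Stage 2 (wrapper): keep features with a positive estimated contribution
    phi = shapley_values(X, y, ranked)
    selected = [f for f in ranked if phi[f] > shapley_threshold]
    return selected, w, phi

# Toy data: feature 0 is informative, feature 1 is class-independent noise,
# feature 2 duplicates feature 0 (redundant).
f0 = [0.0, 0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 1.0]
X = [[v, float(i % 2), v] for i, v in enumerate(f0)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, w, phi = rf_sea_sketch(X, y)
print("Relief weights:", w)
print("Shapley estimates:", phi)
print("Selected features:", selected)
```

On this toy data the filter stage discards the noise feature, while the Shapley stage splits the credit between the two duplicated features. Raising the hypothetical `shapley_threshold` is one way such a sketch could prune a redundant copy; how the paper handles this is not stated in the abstract.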
Copyright information
© 2014 Springer India
About this paper
Cite this paper
Sasikala, S., Appavu alias Balamurugan, S., Geetha, S. (2014). RF-SEA-Based Feature Selection for Data Classification in Medical Domain. In: Mohapatra, D.P., Patnaik, S. (eds) Intelligent Computing, Networking, and Informatics. Advances in Intelligent Systems and Computing, vol 243. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1665-0_59
DOI: https://doi.org/10.1007/978-81-322-1665-0_59
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1664-3
Online ISBN: 978-81-322-1665-0
eBook Packages: Engineering, Engineering (R0)