Performance Tuning of PCA by CFS-Shapley Ensemble and Its Application to Medical Diagnosis
Selecting optimal features is an important area of research in medical data mining systems. Principal component analysis (PCA) is one of the most popular feature selection methods. PCA nevertheless has a drawback: the measurements from all of the original features are used in the projection to the lower-dimensional space. This work therefore aims to tune the performance of PCA and to classify medical profiles. The proposed method is realized as an ensemble procedure with three steps: (i) feature selection using PCA, (ii) feature ranking with correlation-based feature selection (CFS), and (iii) dimension reduction using Shapley value analysis. The variance coverage parameter of PCA is adjusted to yield maximum accuracy, which is measured with specificity, sensitivity, precision, and recall. This facilitates the selection of a compact set of superior features with uncompromised detection rates at a remarkably low cost. To appraise the proposed method, experiments were conducted on 6 different medical data sets using the J48 decision tree classifier. They showed that the proposed procedure improves classification efficiency and accuracy compared with using each technique individually.
Keywords: Data mining, Dimensionality reduction, Feature extraction, Feature selection, Principal component analysis, Shapley value analysis, Classification
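The abstract names three building blocks without giving their details, so the following is only a minimal sketch of the standard textbook form of each one, not the authors' implementation. It shows how a variance coverage threshold picks the number of principal components, the usual CFS merit score, and an exact Shapley value computation over a small feature set. The function names, the eigenvalues, and the `toy_value` table are all illustrative assumptions, and how the paper actually chains the steps is not stated here.

```python
from itertools import combinations
from math import factorial, sqrt


def components_for_coverage(eigenvalues, coverage):
    """Smallest number of leading components whose variance share reaches `coverage`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= coverage:
            return k
    return len(ordered)


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Standard CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean absolute feature-class correlation and r_ff the mean
    absolute feature-feature correlation of the candidate subset.
    """
    k = len(feature_class_corr)
    r_cf = sum(abs(r) for r in feature_class_corr) / k
    if feature_feature_corr:
        r_ff = sum(abs(r) for r in feature_feature_corr) / len(feature_feature_corr)
    else:
        r_ff = 0.0  # a single feature has no pairwise correlations
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def shapley_values(players, value):
    """Exact Shapley value of each player for a coalition value function.

    Enumerates every coalition, so it is only practical for small feature sets.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {p}) - value(coalition))
        result[p] = total
    return result


def toy_value(coalition):
    """Hypothetical payoff of a feature coalition (made-up numbers, assumption)."""
    gains = {"pc1": 0.3, "pc2": 0.2, "pc3": 0.0}
    score = sum(gains[f] for f in coalition)
    if {"pc1", "pc2"} <= coalition:
        score += 0.1  # assumed interaction bonus when both features are present
    return score
```

With the toy table, `shapley_values(["pc1", "pc2", "pc3"], toy_value)` assigns roughly 0.35 to `pc1`, 0.25 to `pc2`, and 0 to `pc3`. The interaction bonus is split evenly between the two features that create it, and a feature with no contribution could be dropped. In a real pipeline the value function would presumably be a classifier-based performance measure rather than a fixed table.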