Ensemble Feature Ranking Methods for Data Intensive Computing Applications

  • Wilker Altidor
  • Taghi M. Khoshgoftaar
  • Jason Van Hulse
  • Amri Napolitano


This paper presents a novel approach to feature ranking based on the notion of ensemble learning. In ensemble learning, a combination of multiple learners is expected to outperform any single learner. Extending this concept to feature ranking, this paper investigates ensemble feature ranking, in which multiple feature selection techniques are combined to produce a single ranking that can be superior to those of the individual ranking techniques. The ensembles are assessed in terms of their robustness to class noise. This study shows that the robustness of an ensemble depends on the type of techniques it contains. As ensemble components, threshold-based feature selection techniques are more robust to class noise than standard filters. The poor stability of ensembles of standard filters is particularly evident, as is the decrease in stability when standard filters are added to a stable ensemble.
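The core idea of combining several rankers into one ranking can be sketched with a simple mean-rank aggregation. This is a minimal illustration, not the paper's exact method: the aggregation rule and the ranker names (`chi2`, `ig`, `relieff`) are assumptions for the example.

```python
# Sketch of ensemble feature ranking via mean-rank aggregation.
# Each component technique assigns a relevance score to every feature;
# scores are converted to ranks and the ranks are averaged.

def scores_to_ranks(scores):
    """Map feature scores to ranks (rank 1 = highest-scoring feature)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def ensemble_ranking(score_lists):
    """Combine several rankers' score lists into one ranking by mean rank."""
    rank_lists = [scores_to_ranks(s) for s in score_lists]
    n = len(score_lists[0])
    mean_ranks = [sum(r[i] for r in rank_lists) / len(rank_lists)
                  for i in range(n)]
    # Final ranking: feature indices from best (lowest mean rank) to worst.
    return sorted(range(n), key=lambda i: mean_ranks[i])

# Example: three hypothetical filter techniques score four features.
chi2    = [0.9, 0.2, 0.5, 0.1]
ig      = [0.8, 0.1, 0.6, 0.3]
relieff = [0.7, 0.4, 0.9, 0.2]
print(ensemble_ranking([chi2, ig, relieff]))  # -> [0, 2, 1, 3]
```

Robustness to class noise can then be assessed by comparing the ensemble's ranking on clean data with its ranking after labels are corrupted, e.g. via a rank correlation measure such as Kendall's tau.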





Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Wilker Altidor (1)
  • Taghi M. Khoshgoftaar (1)
  • Jason Van Hulse (1)
  • Amri Napolitano (1)

  1. Florida Atlantic University, Boca Raton, USA
