Feature Selection Stability Assessment Based on the Jensen-Shannon Divergence

  • Roberto Guzmán-Martínez
  • Rocío Alaiz-Rodríguez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6911)


Feature selection and ranking techniques play an important role in the analysis of high-dimensional data. In particular, their stability becomes crucial when feature importance is later studied in order to better understand the underlying process. The fact that a small change in the dataset may alter the outcome of a feature selection/ranking algorithm has long been overlooked in the literature. We propose an information-theoretic approach that uses the Jensen-Shannon divergence to assess this stability (or robustness). Unlike other measures, this new metric is suitable for different algorithm outcomes: full ranked lists, partial sublists (top-k lists), and the least studied partial ranked lists. This generalized metric quantifies the disagreement among a whole set of lists of the same size, following a probabilistic approach and allowing more importance to be given to differences that appear at the top of the list. We illustrate it and compare it with popular metrics such as the Spearman rank correlation and Kuncheva's index, both on artificially generated feature selection/ranking outcomes and on a spectral fat dataset with different filter-based feature selectors.


Keywords: Feature selection · Feature ranking · Stability · Robustness · Jensen-Shannon divergence
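The stability assessment described in the abstract can be illustrated with a toy computation. The sketch below (not the paper's exact formulation) maps each ranked feature list to a probability distribution that concentrates mass at the top of the list, then computes the generalized Jensen-Shannon divergence among the resulting distributions; the 1/rank weighting in `rank_to_distribution` is an illustrative assumption, and both function names are hypothetical.

```python
import math

def rank_to_distribution(ranking, n_features):
    # Assumed mapping: the feature at rank r (0-based) gets weight 1/(r+1),
    # so disagreements near the top of the list move more probability mass.
    weights = {f: 1.0 / (r + 1) for r, f in enumerate(ranking)}
    total = sum(weights.values())
    return [weights.get(f, 0.0) / total for f in range(n_features)]

def jensen_shannon(dists):
    # Generalized JS divergence among m distributions with uniform weights:
    # JS = H(mean distribution) - mean of H(p_i), H being Shannon entropy.
    # It is 0 when all distributions (rankings) agree, and grows with disagreement.
    m = len(dists)
    n = len(dists[0])
    mean = [sum(d[i] for d in dists) / m for i in range(n)]
    def entropy(p):
        return -sum(x * math.log2(x) for x in p if x > 0)
    return entropy(mean) - sum(entropy(d) for d in dists) / m

# Identical rankings give divergence 0; a reversed ranking gives a positive value.
identical = jensen_shannon([rank_to_distribution([0, 1, 2, 3], 4)] * 2)
different = jensen_shannon([rank_to_distribution([0, 1, 2, 3], 4),
                            rank_to_distribution([3, 2, 1, 0], 4)])
```

In a stability study, the distributions would come from rankings produced on perturbed versions of the dataset (e.g., bootstrap samples), and the divergence would be normalized or compared against the value expected for random rankings.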


References


  1. Abeel, T., Helleputte, T., Van de Peer, Y., Dupont, P., Saeys, Y.: Robust biomarker identification for cancer diagnosis with ensemble feature selection methods. Bioinformatics 26(3), 392 (2010)
  2. Aslam, J., Pavlu, V.: Query hardness estimation using Jensen-Shannon divergence among multiple scoring functions. In: Amati, G., Carpineto, C., Romano, G. (eds.) ECIR 2007. LNCS, vol. 4425, pp. 198–209. Springer, Heidelberg (2007)
  3. Boulesteix, A.-L., Slawski, M.: Stability and aggregation of ranked gene lists. Briefings in Bioinformatics 10(5), 556–568 (2009)
  4. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification. John Wiley and Sons, Chichester (2001)
  5. Dunne, K., Cunningham, P., Azuaje, F.: Solutions to instability problems with sequential wrapper-based approaches to feature selection. Trinity College Dublin Computer Science Technical Report 2002-28 (2002)
  6. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
  7. Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L.A.: Feature Extraction: Foundations and Applications. Studies in Fuzziness and Soft Computing. Springer-Verlag New York, Inc., Secaucus (2006)
  8. He, Z., Yu, W.: Stable feature selection for biomarker discovery. Technical Report arXiv:1001.0887 (January 2010)
  9. Jurman, G., Merler, S., Barla, A., Paoli, S., Galea, A., Furlanello, C.: Algebraic stability indicators for ranked lists in molecular profiling. Bioinformatics 24(2), 258 (2008)
  10. Kalousis, A., Prados, J., Hilario, M.: Stability of feature selection algorithms. In: Fifth IEEE International Conference on Data Mining, p. 8. IEEE, Los Alamitos (2005)
  11. Kalousis, A., Prados, J., Hilario, M.: Stability of feature selection algorithms: a study on high-dimensional spaces. Knowledge and Information Systems 12, 95–116 (2007), doi:10.1007/s10115-006-0040-8
  12. Kullback, S., Leibler, R.: On information and sufficiency. The Annals of Mathematical Statistics 22(1), 79–86 (1951)
  13. Kuncheva, L.I.: A stability index for feature selection. In: Proceedings of the 25th IASTED International Multi-Conference: Artificial Intelligence and Applications, pp. 390–395. ACTA Press (2007)
  14. Lin, J.: Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory 37(1), 145–151 (1991)
  15. Loscalzo, S., Yu, L., Ding, C.: Consensus group stable feature selection. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2009, pp. 567–576 (2009)
  16. Lustgarten, J.L., Gopalakrishnan, V., Visweswaran, S.: Measuring stability of feature selection in biomedical datasets. In: AMIA Annual Symposium Proceedings, vol. 2009, p. 406. American Medical Informatics Association (2009)
  17. MATLAB version 7.10.0 (R2010a). The MathWorks Inc., Natick, Massachusetts (2010)
  18. Osorio, M.T., Zumalacárregui, J.M., Alaiz-Rodríguez, R., Guzmán-Martínez, R., Engelsen, S.B., Mateo, J.: Differentiation of perirenal and omental fat quality of suckling lambs according to the rearing system from Fourier transform mid-infrared spectra using partial least squares and artificial neural networks. Meat Science 83(1), 140–147 (2009)
  19. Saeys, Y., Abeel, T., Van de Peer, Y.: Robust feature selection using ensemble feature selection techniques. In: Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. LNCS (LNAI), vol. 5212, pp. 313–325. Springer, Heidelberg (2008)
  20. Somol, P., Novovicova, J.: Evaluating stability and comparing output of feature selectors that optimize feature subset cardinality. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 1921–1939 (2010)
  21. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (1999)
  22. Zucknick, M., Richardson, S., Stronach, E.A.: Comparing the characteristics of gene expression profiles derived by univariate and multivariate classification methods. Statistical Applications in Genetics and Molecular Biology 7(1), 7 (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Roberto Guzmán-Martínez (1)
  • Rocío Alaiz-Rodríguez (2)
  1. Servicio de Informática y Comunicaciones, Universidad de León, León, Spain
  2. Dpto. de Ingeniería Eléctrica y de Sistemas, Universidad de León, León, Spain