On Stability of Ensemble Gene Selection

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2015 (IDEAL 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9375)

Abstract

When feature selection aims at discovering useful knowledge from data, not just at producing an accurate classifier, the stability of the selected features is a crucial issue. In recent years, the ensemble paradigm has been proposed as a primary avenue for enhancing the stability of feature selection, especially in high-dimensional/small-sample-size domains such as biomedicine. However, the potential and the implications of the ensemble approach have been investigated only partially, and the indications provided by recent literature are not yet exhaustive. As a contribution in this direction, we present an empirical analysis that evaluates the effects of an ensemble strategy in the context of gene selection from high-dimensional microarray data. Our results show that the ensemble paradigm is not necessarily beneficial in itself, but it can be very useful when the underlying selection algorithm is intrinsically less stable.
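The abstract does not spell out the ensemble procedure or the stability measure, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a bootstrap-based ensemble in which a base ranker is run on resampled versions of the data and the resulting rankings are combined by summing rank positions, and it assumes stability is scored with Kuncheva's consistency index between equal-sized gene subsets. The ranker `score_features` (absolute difference of class means) and all function names are hypothetical placeholders chosen for illustration.

```python
import random
from itertools import combinations


def kuncheva_index(a, b, n_features):
    """Kuncheva's consistency index for two feature subsets of equal size k.

    Returns 1.0 for identical subsets; values near 0 indicate the overlap
    expected by chance when drawing k features out of n_features.
    """
    k = len(a)
    if len(b) != k or not 0 < k < n_features:
        raise ValueError("subsets must share a size k with 0 < k < n_features")
    r = len(set(a) & set(b))  # number of features common to both subsets
    return (r * n_features - k * k) / (k * (n_features - k))


def average_stability(subsets, n_features):
    """Mean pairwise consistency index over a collection of subsets."""
    pairs = list(combinations(subsets, 2))
    return sum(kuncheva_index(a, b, n_features) for a, b in pairs) / len(pairs)


def score_features(X, y):
    """Hypothetical base ranker: absolute difference of the two class means."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return scores


def ensemble_select(X, y, k, n_bootstraps, rng):
    """Select k features by aggregating rankings over bootstrap samples.

    Assumption: resampling is stratified by class so that every bootstrap
    sample contains both classes, and rankings are aggregated by rank sum.
    """
    n_features = len(X[0])
    rank_sum = [0] * n_features
    by_class = {c: [i for i, label in enumerate(y) if label == c] for c in set(y)}
    for _ in range(n_bootstraps):
        # Draw a bootstrap sample with replacement inside each class.
        idx = [i for members in by_class.values()
               for i in rng.choices(members, k=len(members))]
        scores = score_features([X[i] for i in idx], [y[i] for i in idx])
        # Rank 0 is the best-scoring feature in this bootstrap sample.
        order = sorted(range(n_features), key=lambda j: -scores[j])
        for rank, j in enumerate(order):
            rank_sum[j] += rank
    # Features with the lowest summed rank are the ensemble's choice.
    return sorted(range(n_features), key=lambda j: rank_sum[j])[:k]
```

One way to use the sketch is to call `ensemble_select` on several perturbed versions of a dataset (for example, random subsamples), collect the returned subsets, and pass them to `average_stability`. Comparing that value against the one obtained from the base ranker alone mirrors, in spirit, the kind of comparison the abstract describes.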

Acknowledgments

This research was supported by the Sardinia Regional Government (project CRP‐17615, DENIS: Dataspaces Enhancing the Next Internet in Sardinia).

Author information

Corresponding author

Correspondence to Barbara Pes.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Dessì, N., Pes, B., Angioni, M. (2015). On Stability of Ensemble Gene Selection. In: Jackowski, K., Burduk, R., Walkowiak, K., Wozniak, M., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2015. IDEAL 2015. Lecture Notes in Computer Science, vol 9375. Springer, Cham. https://doi.org/10.1007/978-3-319-24834-9_48

  • DOI: https://doi.org/10.1007/978-3-319-24834-9_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24833-2

  • Online ISBN: 978-3-319-24834-9

  • eBook Packages: Computer Science, Computer Science (R0)
