On Stability of Ensemble Gene Selection
When feature selection aims at discovering useful knowledge from data, and not just at producing an accurate classifier, the stability of the selected features is a crucial issue. In recent years, the ensemble paradigm has been proposed as a primary avenue for enhancing the stability of feature selection, especially in high-dimensional/small sample size domains such as biomedicine. However, the potential and the implications of the ensemble approach have been investigated only partially, and the indications provided by recent literature are not yet exhaustive. As a contribution in this direction, we present an empirical analysis that evaluates the effects of an ensemble strategy in the context of gene selection from high-dimensional microarray data. Our results show that the ensemble paradigm is not always beneficial in itself, but that it can be very useful when the selection algorithm is intrinsically less stable.
Keywords: Feature selection stability; Ensemble paradigm; Gene selection
This research was supported by the Sardinia Regional Government (project CRP‐17615, DENIS: Dataspaces Enhancing the Next Internet in Sardinia).