A Hybrid Model to Favor the Selection of High Quality Features in High Dimensional Domains

  • Conference paper
Intelligent Data Engineering and Automated Learning - IDEAL 2011 (IDEAL 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6936)

Abstract

Feature selection is widely recognized as a challenging task in application problems that involve a large number of features and a limited number of training samples. Filters and wrappers are the most popular feature selection strategies, but recent literature shows the emergence of hybrid approaches that aim to combine the strengths of filters and wrappers while avoiding their drawbacks. This paper proposes a new hybrid model for feature selection that uses a filter method to weight the relevance of each feature. Top-ranked features are selected incrementally, resulting in a set of nested feature spaces of relatively small size. An evolutionary wrapper then refines each space by extracting small subsets of highly predictive features. Extensive experiments on a benchmark microarray dataset demonstrate the effectiveness of the proposed approach.
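The pipeline outlined in the abstract (filter ranking, nested top-ranked feature spaces, evolutionary refinement of each space) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the filter score (class-mean difference over summed standard deviations), the nearest-centroid classifier with leave-one-out accuracy as wrapper fitness, the toy genetic algorithm and its parameters (`pop_size`, `generations`, `penalty`), and the toy data.

```python
import random
from statistics import mean, pstdev

def filter_scores(X, y):
    """Assumed filter: |difference of class means| / (sum of class std devs)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-9))
    return scores

def nested_spaces(scores, sizes):
    """Nested lists of top-ranked feature indices, one list per requested size."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return [ranked[:k] for k in sizes]

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed wrapper)."""
    if not features:
        return 0.0
    hits = 0
    for i in range(len(X)):
        dist = {}
        for label in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroid = [mean(r[j] for r in rows) for j in features]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(features, centroid))
        hits += min(dist, key=dist.get) == y[i]
    return hits / len(X)

def evolve_subset(X, y, space, rng, pop_size=12, generations=15, penalty=0.01):
    """Toy genetic algorithm over bit masks on `space`; favors small, accurate subsets."""
    def fitness(mask):
        chosen = [f for f, bit in zip(space, mask) if bit]
        return loo_accuracy(X, y, chosen) - penalty * len(chosen)

    pop = [[rng.random() < 0.5 for _ in space] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            cut = rng.randrange(1, len(space)) if len(space) > 1 else 0
            child = p[:cut] + q[cut:]          # one-point crossover
            flip = rng.randrange(len(space))
            child[flip] = not child[flip]      # single-bit mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [f for f, bit in zip(space, best) if bit]

# Toy data (hypothetical): 8 samples, 4 features; only feature 0 separates the classes.
X = [[0.0, 5, 1, 2], [0.1, 3, 2, 1], [0.2, 4, 1, 2], [0.1, 5, 2, 1],
     [1.0, 4, 1, 1], [0.9, 5, 2, 2], [1.1, 3, 1, 2], [1.0, 4, 2, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

rng = random.Random(0)
for space in nested_spaces(filter_scores(X, y), sizes=[2, 4]):
    print(space, "->", evolve_subset(X, y, space, rng))
```

The size penalty in the fitness function is one simple way to push the wrapper toward small subsets; the paper's actual fitness definition, classifier, and filter may differ.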

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cannas, L.M., Dessì, N., Pes, B. (2011). A Hybrid Model to Favor the Selection of High Quality Features in High Dimensional Domains. In: Yin, H., Wang, W., Rayward-Smith, V. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2011. IDEAL 2011. Lecture Notes in Computer Science, vol 6936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23878-9_28

  • DOI: https://doi.org/10.1007/978-3-642-23878-9_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23877-2

  • Online ISBN: 978-3-642-23878-9

  • eBook Packages: Computer Science, Computer Science (R0)
