Recursive Feature Elimination Based on Linear Discriminant Analysis for Molecular Selection and Classification of Diseases

  • Edmundo Bonilla Huerta
  • Roberto Morales Caporal
  • Marco Antonio Arjona
  • José Crispín Hernández Hernández
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7996)


We propose an effective method for gene selection and disease classification on DNA microarray data: Recursive Feature Elimination based on Linear Discriminant Analysis (RFELDA). LDA serves two roles in the method: it is the classifier, and its discriminant coefficients are used to rank each gene. The proposed algorithm was tested on four well-known datasets from the literature and compared with recent state-of-the-art algorithms. The experimental results on these datasets show that RFELDA outperforms similar methods reported in the literature and obtains high classification accuracies with a relatively small number of genes.
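The ranking-and-elimination idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact ranking criterion, the number of genes removed per iteration, or how singular scatter matrices are handled. The sketch assumes a two-class problem, ranks genes by the absolute value of the Fisher LDA coefficient, removes one gene per iteration, and adds a small ridge term to the within-class covariance. The function names `lda_coefficients` and `rfe_lda`, the `ridge` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np


def lda_coefficients(X, y, ridge=1e-3):
    """Two-class Fisher LDA weights: w = (S_w + ridge * I)^-1 (mu_1 - mu_0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance of the class-centered samples.
    centered = np.vstack([X0 - mu0, X1 - mu1])
    s_w = centered.T @ centered / (len(X) - 2)
    # Assumption of this sketch: with more genes than samples S_w is
    # singular, so a small ridge term keeps the linear system solvable.
    s_w += ridge * np.eye(X.shape[1])
    return np.linalg.solve(s_w, mu1 - mu0)


def rfe_lda(X, y, n_keep, ridge=1e-3):
    """Drop the gene with the smallest |LDA coefficient| until n_keep remain."""
    remaining = list(range(X.shape[1]))
    eliminated = []
    while len(remaining) > n_keep:
        w = lda_coefficients(X[:, remaining], y, ridge)
        worst = int(np.argmin(np.abs(w)))
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Synthetic stand-in for a microarray matrix (rows: samples, columns: genes).
rng = np.random.default_rng(0)
n_per_class, n_genes = 60, 10
y = np.repeat([0, 1], n_per_class)
X = rng.normal(size=(2 * n_per_class, n_genes))
informative = [2, 7]
X[n_per_class:, informative] += 5.0  # shift only the informative genes in class 1

selected, eliminated = rfe_lda(X, y, n_keep=2)
print("selected genes:", selected)
print("elimination order:", eliminated)
```

Refitting the LDA after every removal is what makes the elimination recursive: the coefficients of the surviving genes are re-estimated without the discarded one, so the ranking can change from one iteration to the next. On real microarray data, where genes far outnumber samples, removing several genes per iteration and choosing the regularisation more carefully would likely be necessary.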


Gene Selection · Classification · LDA · RFE · Microarray · Filter





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Edmundo Bonilla Huerta (1)
  • Roberto Morales Caporal (1)
  • Marco Antonio Arjona (1)
  • José Crispín Hernández Hernández (1)
  1. Laboratorio de Investigación en Tecnologías Inteligentes, Instituto Tecnológico de Apizaco, Apizaco, México
