Abstract
Microarray gene expression techniques provide snapshots of the gene expression levels of samples and are promising for clinical diagnosis and genomic pathology. However, the curse of dimensionality and other problems have challenged researchers for a decade. Selecting a few discriminative genes is an important way to address these problems, but gene subset selection is an NP-hard problem. This paper proposes an effective gene selection framework that integrates gene filtering, sample selection, and a multiobjective evolutionary algorithm (MOEA). We use the MOEA to optimize four objective functions that take into account class relevance, feature redundancy, classification performance, and the number of selected genes. Experimental comparison shows that the proposed approach outperforms a well-known recursive feature elimination method in terms of classification performance and time complexity.
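To illustrate what optimizing four objectives at once means, here is a minimal sketch of Pareto dominance over candidate gene subsets. It is not the paper's implementation: the abstract does not define the objective functions, so the objective vectors, their values, and the convention of expressing every objective as a minimization (relevance is negated) are assumptions made only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical objective vectors for four candidate gene subsets:
# (-class relevance, feature redundancy, classification error, number of genes)
candidates = [
    (-0.80, 0.30, 0.10, 12),
    (-0.75, 0.35, 0.12, 15),  # worse than the first subset on all four objectives
    (-0.60, 0.10, 0.15, 5),
    (-0.90, 0.50, 0.08, 30),
]

front = pareto_front(candidates)
print(front)  # indices of the non-dominated subsets
```

A multiobjective evolutionary algorithm keeps a set of such non-dominated subsets rather than a single best one, so the trade-off between, for example, fewer genes and lower classification error stays visible until a final choice is made.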
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Y., Ngom, A., Rueda, L. (2012). A Framework of Gene Subset Selection Using Multiobjective Evolutionary Algorithm. In: Shibuya, T., Kashima, H., Sese, J., Ahmad, S. (eds) Pattern Recognition in Bioinformatics. PRIB 2012. Lecture Notes in Computer Science, vol 7632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34123-6_4
DOI: https://doi.org/10.1007/978-3-642-34123-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34122-9
Online ISBN: 978-3-642-34123-6
eBook Packages: Computer Science, Computer Science (R0)