Abstract
Microarrays are an emerging technology that allows biologists to better understand the interactions between disease and normal states at the gene level. However, the volume of data these tools generate becomes problematic when the data must be analyzed automatically (e.g., for diagnostic purposes). In this work, the authors present a novel gene selection method based on Genetic Algorithms (GAs). The proposed method uses GAs to search for subsets of genes that optimize two quality measures for the clusters present in the domain. As a result, the data are better represented and classification of unknown samples may become easier. To demonstrate the strength of the proposed approach, experiments were carried out on four publicly available microarray datasets.
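The abstract does not say which two cluster quality measures are optimized or how the GA is configured, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes within-class compactness and between-centroid separation as the two measures, combines them into a single ratio, and uses a plain GA with tournament selection, one-point crossover, bit-flip mutation and elitism. The names `cluster_quality` and `select_genes`, and all parameter values, are hypothetical and not taken from the chapter.

```python
import math
import random


def cluster_quality(data, labels, mask):
    """Score a gene subset as centroid separation over within-class scatter.

    Both measures are illustrative assumptions, not the chapter's definitions.
    """
    genes = [g for g, keep in enumerate(mask) if keep]
    classes = sorted(set(labels))
    if not genes or len(classes) < 2:
        return 0.0
    # Class centroids restricted to the selected genes.
    centroids = {}
    for c in classes:
        rows = [row for row, y in zip(data, labels) if y == c]
        centroids[c] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    # Measure 1 (assumed): mean distance of each sample to its class centroid.
    compactness = sum(
        math.dist([row[g] for g in genes], centroids[y])
        for row, y in zip(data, labels)
    ) / len(data)
    # Measure 2 (assumed): mean pairwise distance between class centroids.
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    separation = sum(
        math.dist(centroids[a], centroids[b]) for a, b in pairs
    ) / len(pairs)
    return separation / (compactness + 1e-9)


def select_genes(data, labels, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Search for a gene subset (boolean mask) with a simple GA.

    Requires at least two genes, because one-point crossover needs a cut point.
    """
    rng = random.Random(seed)
    n_genes = len(data[0])

    def fitness(mask):
        return cluster_quality(data, labels, mask)

    # Random initial population of gene masks.
    population = [
        [rng.random() < 0.5 for _ in range(n_genes)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_pop = ranked[:2]  # elitism: carry over the two best masks
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [(not b) if rng.random() < p_mut else b for b in child]
            next_pop.append(child)
        population = next_pop
    return max(population, key=fitness)
```

Here `data` is a list of samples, each a list of expression values, and `labels` holds one class label per sample; `select_genes(data, labels)` returns a boolean mask over the genes. Folding the two measures into one ratio is a simplification chosen for brevity; a multi-objective formulation would be an equally plausible reading of the abstract.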
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Souza, B.F., de Carvalho, A.C.P.L.F. (2004). Gene Selection Using Genetic Algorithms. In: Barreiro, J.M., Martín-Sánchez, F., Maojo, V., Sanz, F. (eds) Biological and Medical Data Analysis. ISBMDA 2004. Lecture Notes in Computer Science, vol 3337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30547-7_48
DOI: https://doi.org/10.1007/978-3-540-30547-7_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23964-2
Online ISBN: 978-3-540-30547-7
eBook Packages: Springer Book Archive