Abstract
In this paper, we propose a support vector machine (SVM) ensemble classification method. First, the dataset is preprocessed with the Wilcoxon rank sum test to filter out irrelevant genes. An SVM is then trained on the training set and tested on that same training set to obtain prediction results. Samples that are misclassified or predicted with low confidence are selected to train a second SVM, which is tested in the same way. Similarly, a third SVM is trained on the samples that the second SVM cannot classify correctly with high confidence. The three SVMs form the SVM ensemble classifier. Finally, the testing set is fed into the ensemble classifier, and the final predictions are obtained by majority voting. Experiments are performed on two standard benchmark datasets: Breast Cancer and ALL/AML Leukemia. The experimental results demonstrate that the proposed method can reach state-of-the-art classification performance.
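The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the kernel, the p-value cutoff, or how confidence is measured, so the linear kernel, the `alpha` threshold, the use of the absolute SVM decision value as a confidence score, the `margin` cutoff, and the choice to re-test each SVM on its own training subset are all assumptions. The function names are hypothetical, and the data are synthetic rather than the benchmark datasets.

```python
import numpy as np
from scipy.stats import ranksums
from sklearn.svm import SVC


def select_genes(X, y, alpha=0.01):
    """Return indices of genes whose Wilcoxon rank sum p-value is below alpha.

    alpha is an assumed cutoff; labels are assumed to be 0/1 integers.
    """
    pvals = np.array([ranksums(X[y == 0, j], X[y == 1, j])[1]
                      for j in range(X.shape[1])])
    return np.where(pvals < alpha)[0]


def hard_samples(clf, X, y, margin):
    """Indices that clf misclassifies or classifies with low confidence.

    Confidence is taken to be |decision value| (an assumption).
    """
    score = clf.decision_function(X)
    pred = clf.predict(X)
    return np.where((pred != y) | (np.abs(score) < margin))[0]


def fit_ensemble(X, y, margin=1.0, n_members=3):
    """Train up to n_members SVMs, each on the previous member's hard samples."""
    members = []
    idx = np.arange(len(y))
    for _ in range(n_members):
        # Stop early if no hard samples remain or only one class is left.
        if len(idx) == 0 or len(np.unique(y[idx])) < 2:
            break
        clf = SVC(kernel="linear").fit(X[idx], y[idx])
        members.append(clf)
        # Re-test on the subset this member was trained on (an assumption).
        idx = idx[hard_samples(clf, X[idx], y[idx], margin)]
    return members


def predict_ensemble(members, X):
    """Majority vote over 0/1 predictions; ties fall back to the first member."""
    votes = np.stack([m.predict(X) for m in members])
    ones = votes.sum(axis=0)
    pred = (2 * ones > len(members)).astype(int)
    tie = 2 * ones == len(members)
    pred[tie] = votes[0][tie]
    return pred


# Synthetic stand-in for expression data: 80 samples, 200 genes,
# of which the first 10 are shifted in class 1.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 40)
X = rng.normal(size=(80, 200))
X[:, :10] += 2.0 * y[:, None]

# Alternate samples between training and testing sets.
X_train, y_train = X[::2], y[::2]
X_test, y_test = X[1::2], y[1::2]

genes = select_genes(X_train, y_train)           # filter on training data only
members = fit_ensemble(X_train[:, genes], y_train)
pred = predict_ensemble(members, X_test[:, genes])
print(len(genes), len(members), (pred == y_test).mean())
```

Gene selection is fitted on the training split only so that the test split does not influence which genes are kept. If the hard-sample subset becomes empty or contains a single class, the sketch simply stops with fewer than three members; how the original method handles that case is not stated in the abstract.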
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liao, C., Li, S. (2007). A Support Vector Machine Ensemble for Cancer Classification Using Gene Expression Data. In: Măndoiu, I., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2007. Lecture Notes in Computer Science, vol 4463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72031-7_44
DOI: https://doi.org/10.1007/978-3-540-72031-7_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72030-0
Online ISBN: 978-3-540-72031-7
eBook Packages: Computer Science (R0)