Abstract
We address classification problems in which the number of input components (variables, features) is very large compared to the number of training samples. Such problems arise in Internet applications such as text filtering, in biomedical applications such as medical diagnosis from genomic or proteomic data, and in drug screening from combinatorial chemistry data. In this setting it is often desirable to perform feature selection to reduce the number of inputs, whether for efficiency, for performance, or to gain understanding of the data and the classifiers. We compare a number of methods on mass-spectrometric data of serum proteins from asymptomatic patients and prostate cancer patients. We show empirical evidence that, in spite of the high risk of overfitting, non-linear methods can outperform linear methods, both in predictive performance and in the number of features selected.
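To make the comparison concrete, the following is a minimal sketch of the kind of experiment described above: a linear feature-selection pipeline (recursive feature elimination driven by a linear SVM) contrasted with a non-linear pipeline (a univariate filter followed by an RBF-kernel SVM) on high-dimensional, small-sample data. It is not the chapter's exact protocol; scikit-learn, the synthetic data, and all parameter values are assumptions introduced for illustration only.

# Hypothetical sketch; library choices and parameters are assumptions,
# not taken from the chapter.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Many input components, few training samples: the regime discussed above.
X, y = make_classification(n_samples=100, n_features=1000, n_informative=10,
                           random_state=0)

# Linear baseline: recursive feature elimination driven by a linear SVM.
linear_pipe = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(SVC(kernel="linear", C=1.0), n_features_to_select=20, step=0.1)),
    ("clf", SVC(kernel="linear", C=1.0)),
])

# Non-linear alternative: univariate filter followed by an RBF-kernel SVM.
nonlinear_pipe = Pipeline([
    ("scale", StandardScaler()),
    ("filter", SelectKBest(f_classif, k=20)),
    ("clf", SVC(kernel="rbf", C=1.0, gamma="scale")),
])

# Cross-validated accuracy for both pipelines.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, pipe in [("linear + RFE", linear_pipe), ("RBF + filter", nonlinear_pipe)]:
    scores = cross_val_score(pipe, X, y, cv=cv)
    print(f"{name}: accuracy {scores.mean():.3f} +/- {scores.std():.3f}")

With so few samples relative to features, the cross-validation folds themselves are noisy, which is why any claim that one family of methods outperforms the other needs careful statistical treatment of the kind the chapter undertakes.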
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Guyon, I., Bitter, HM., Ahmed, Z., Brown, M., Heller, J. (2005). Multivariate Non-Linear Feature Selection with Kernel Methods. In: Nikravesh, M., Zadeh, L.A., Kacprzyk, J. (eds) Soft Computing for Information Processing and Analysis. Studies in Fuzziness and Soft Computing, vol 164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32365-1_12
DOI: https://doi.org/10.1007/3-540-32365-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22930-8
Online ISBN: 978-3-540-32365-5
eBook Packages: Engineering, Engineering (R0)