Selecting Features with SVM
A common problem in feature selection is establishing the minimum number of features that must be retained so that important information is not lost. We describe a method for choosing this number that makes use of Support Vector Machines. The method is based on controlling the angle by which the decision hyperplane is tilted due to feature selection.
Experiments were performed on three text datasets generated from a Wikipedia dump. The amount of retained information was estimated by classification accuracy. Although the method is parametric, we show that, unlike other methods, once its parameter is chosen it can be applied to a number of similar problems (e.g. one value can be used for various datasets originating from Wikipedia). For a constant value of the parameter, dimensionality was reduced by 78% to 90%, depending on the dataset. The relative accuracy drop due to feature removal was less than 0.5% in those experiments.
Keywords: feature selection, SVM, documents categorization
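
The abstract does not spell out the exact procedure, so the following is only a minimal sketch of one way the angle criterion could be realised. It assumes a linear SVM with weight vector `w`, that features are ranked by `|w_i|`, that discarding a feature amounts to zeroing its weight without retraining, and that the method's parameter is a maximum tilt angle in degrees. The function name `min_features_for_angle` and the parameter `max_angle_deg` are hypothetical. Under these assumptions the cosine of the angle between `w` and its truncated version equals the ratio of their norms, so the smallest admissible feature set can be read off a cumulative sum.

```python
import numpy as np


def min_features_for_angle(w, max_angle_deg):
    """Indices of the smallest feature set that tilts the hyperplane by at most max_angle_deg.

    Assumption (not taken from the paper): features are dropped in order of
    increasing |w_i| and the remaining weights are left unchanged.
    """
    w = np.asarray(w, dtype=float)
    norm = np.linalg.norm(w)
    if norm == 0.0:
        raise ValueError("weight vector must be non-zero")

    # Rank features from largest to smallest absolute weight.
    order = np.argsort(-np.abs(w))

    # For a truncated vector w_k (top-k weights kept, rest zeroed),
    # w . w_k = ||w_k||^2, hence cos(angle) = ||w_k|| / ||w||.
    cosines = np.sqrt(np.cumsum(w[order] ** 2)) / norm

    # Smallest k whose cosine reaches the threshold; cosines is non-decreasing.
    threshold = np.cos(np.deg2rad(max_angle_deg))
    k = int(np.searchsorted(cosines, threshold, side="left")) + 1
    k = min(k, w.size)  # guard against rounding at the full-vector end
    return order[:k]
```

For example, with weights `[3, 0, 4, 0.1]` and a 5 degree limit, the sketch keeps only the two features with weights 4 and 3. In practice `w` might come from the `coef_` attribute of a fitted linear SVM, and the retained subset would still need to be validated by classification accuracy, as the experiments above do.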