Adaptive Genetic Algorithm to Select Training Data for Support Vector Machines

  • Jakub Nalepa
  • Michal Kawulok
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8602)


This paper presents a new adaptive genetic algorithm (AGA) for selecting training data for support vector machines (SVMs). The choice of SVM training data strongly influences classification accuracy and time, especially for large and noisy data sets. In the proposed AGA, a population of solutions evolves over time. The AGA parameters, including the chromosome length, are adapted according to the current state of the exploration of the solution space. We propose a new multi-parent crossover operator for an efficient search. A new metric of distance between individuals is introduced and applied in the AGA. It is based on a fast analysis of the distribution of vectors in the feature space, obtained using principal component analysis. An extensive experimental study, performed on well-known benchmark sets along with real-world and artificial data sets, confirms that the AGA outperforms a standard GA in terms of convergence capabilities. It also reduces the number of support vectors and allows for faster SVM classification.
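The abstract does not specify how the multi-parent crossover works, so the following is only an illustrative sketch of one possible form. It assumes that a chromosome is a collection of distinct training-sample indices and that a child is drawn from the pooled indices of its parents. The function name multi_parent_crossover and the parameter child_len are hypothetical and do not come from the paper.

```python
import random


def multi_parent_crossover(parents, child_len, rng):
    """Build one child from the pooled indices of several parents.

    Assumption: each chromosome is a collection of distinct
    training-sample indices. The operator used in the paper may differ.
    """
    # Collect every distinct index carried by at least one parent.
    pool = sorted(set().union(*parents))
    if child_len > len(pool):
        raise ValueError("child_len exceeds the number of pooled indices")
    # Draw child_len distinct indices from the pool without replacement.
    return sorted(rng.sample(pool, child_len))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
parents = [[0, 3, 7], [3, 8, 12], [1, 7, 15]]
child = multi_parent_crossover(parents, child_len=4, rng=rng)
```

Because child_len is an argument rather than a constant, the same operator could be called with a different length in each generation, which is one way an adaptive chromosome length might be accommodated.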


Keywords: Adaptive genetic algorithm, Support vector machines, Training data selection





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Silesian University of Technology, Gliwice, Poland
