Chunking-Coordinated-Synthetic Approaches to Large-Scale Kernel Machines

  • Francisco J. González-Castaño
  • Robert R. Meyer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3036)

Abstract

We consider a kernel-based approach to nonlinear classification that coordinates the generation of “synthetic” points (to be used in the kernel) with “chunking” (working with subsets of the data) in order to significantly reduce the size of the optimization problems required to construct classifiers for massive datasets. Rather than solving a single massive classification problem involving all points in the training set, we employ a series of problems that gradually increase in size and which consider kernels based on small numbers of synthetic points. These synthetic points are generated by solving and combining the results of relatively small nonlinear unconstrained optimization problems. In addition to greatly reducing optimization problem size, the procedure that we describe also has the advantage of being easily parallelized. Computational results show that our method efficiently generates high-performance simple classifiers on a problem involving a realistic dataset.
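The abstract's strategy can be sketched in code. This is a minimal illustration, not the authors' implementation: it assumes an RBF kernel, approximates the paper's synthetic-point generation (small nonlinear optimization problems) with simple centroid updates, and uses plain gradient descent on a squared hinge loss. The "chunking" appears as a sequence of growing subproblems that warm-start from the previous solution, so each optimization stays small relative to the full dataset. All function names and parameters here are illustrative.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Pairwise RBF kernel between data rows X and synthetic points Z.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def synthetic_points(X, m, iters=20, seed=0):
    # Stand-in for the paper's synthetic-point generation: the authors solve
    # small unconstrained nonlinear optimization problems and combine the
    # results; here we use k-means-style centroid refinement as a proxy.
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), m, replace=False)].copy()
    for _ in range(iters):
        labels = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(m):
            pts = X[labels == j]
            if len(pts):
                Z[j] = pts.mean(0)
    return Z

def train_chunked(X, y, m=4, chunks=3, gamma=1.0, lr=0.1, epochs=200):
    # Chunking: solve a series of classification subproblems that grow in
    # size, warm-starting each from the previous chunk's weights. The model
    # is linear in the m kernel columns defined by the synthetic points.
    Z = synthetic_points(X, m)
    w = np.zeros(m + 1)  # m kernel weights plus a bias term
    n = len(X)
    for c in range(1, chunks + 1):
        K = rbf_kernel(X[: n * c // chunks], Z, gamma)
        yc = y[: n * c // chunks]
        for _ in range(epochs):
            margins = yc * (K @ w[:m] + w[m])
            act = margins < 1  # points violating the margin (squared hinge)
            resid = yc[act] * (1 - margins[act])
            denom = max(act.sum(), 1)
            w[:m] += lr * (K[act] * resid[:, None]).sum(0) * 2 / denom
            w[m] += lr * resid.sum() * 2 / denom
    return Z, w

def predict(X, Z, w, gamma=1.0):
    K = rbf_kernel(X, Z, gamma)
    return np.sign(K @ w[: len(Z)] + w[len(Z)])
```

Because the classifier depends only on the m synthetic points rather than all training points, each subproblem has m + 1 variables regardless of dataset size, which is the source of the problem-size reduction the abstract describes; the per-chunk kernel evaluations are also independent across points and hence easy to parallelize.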

Keywords

Support Vector Machine, Support Vector, Testing Correctness, Training Point, Massive Dataset

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Francisco J. González-Castaño, Departamento de Ingeniería Telemática, ETSI Telecomunicación, Universidad de Vigo, Vigo, Spain
  • Robert R. Meyer, Computer Sciences Department, University of Wisconsin-Madison, USA
