Investigating Convergence of Linear SVM Implemented in PermonSVM Employing MPRGP Algorithm
This paper presents PermonSVM, a new machine learning tool that is part of our PERMON toolbox. It implements linear two-class Support Vector Machines (SVM). PermonSVM is built on top of PermonQP, the PERMON module for quadratic programming (QP), which in turn uses PETSc. The main advantage of PermonSVM is that it runs in parallel, with the parallelism coming from the distribution of matrices and vectors. The MPRGP algorithm, implemented in PermonQP, is used to solve the QP problem arising from the dual SVM formulation. The scalability of MPRGP has been demonstrated on problems in mechanics with more than a billion unknowns, solved on tens of thousands of cores. Apart from the scalability of our approach, we also investigate the relations between the training rate, the hyperplane margin, the value of the dual functional, and the norm of the projected gradient.
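To make the four monitored quantities concrete, the following minimal sketch computes them for a toy problem. It is not PermonSVM code and does not implement MPRGP: it swaps in a plain projected gradient iteration and assumes a bias-free dual formulation, minimize 0.5 a^T H a - sum(a) subject to 0 <= a <= C, so that only box constraints remain. The function names, the toy data, and the step-size choice are hypothetical and serve illustration only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_dual_svm(X, y, C=1.0, iters=500):
    """Plain projected gradient (NOT MPRGP) on the assumed bias-free dual QP:
    minimize 0.5*a^T H a - sum(a) subject to 0 <= a <= C."""
    n = len(y)
    Z = [[yi * xij for xij in xi] for xi, yi in zip(X, y)]   # rows y_i * x_i
    H = [[dot(zi, zj) for zj in Z] for zi in Z]              # dual Hessian Z Z^T
    # trace(H) bounds the largest eigenvalue of the PSD matrix H from above,
    # so 1/trace(H) is a safe (conservative) step length.
    step = 1.0 / max(sum(H[i][i] for i in range(n)), 1e-12)
    alpha = [0.0] * n

    def gradient(a):
        return [dot(H[i], a) - 1.0 for i in range(n)]

    for _ in range(iters):
        g = gradient(alpha)
        # Gradient step followed by projection onto the box [0, C].
        alpha = [min(max(a - step * gi, 0.0), C) for a, gi in zip(alpha, g)]

    g = gradient(alpha)
    # Projected gradient: keep only the components along which a feasible
    # descent step exists (it vanishes at a solution of the QP).
    pg = [min(gi, 0.0) if a <= 0.0 else max(gi, 0.0) if a >= C else gi
          for a, gi in zip(alpha, g)]
    w = [sum(a * z[j] for a, z in zip(alpha, Z)) for j in range(len(X[0]))]
    dual_value = 0.5 * dot(alpha, [dot(H[i], alpha) for i in range(n)]) - sum(alpha)
    return w, dual_value, math.sqrt(dot(pg, pg))

def report(X, y, w):
    """Training rate (fraction classified correctly) and margin 2/||w||."""
    correct = sum(1 for xi, yi in zip(X, y) if yi * dot(w, xi) > 0)
    return correct / len(y), 2.0 / math.sqrt(dot(w, w))

# Hypothetical toy data, separable by a hyperplane through the origin.
X = [[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]]
y = [1.0, 1.0, -1.0, -1.0]
w, dual_value, pg_norm = train_dual_svm(X, y)
rate, margin = report(X, y, w)
print(rate, margin, dual_value, pg_norm)
```

Stopping the loop early, for example with a small `iters`, leaves a larger projected gradient norm and lets one observe how the training rate and margin behave before the dual functional has converged.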
Keywords: Support Vector Machines, SVM, PERMON, PermonSVM, PermonQP, MPRGP, Quadratic programming, QP
This work was supported by the Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project IT4Innovations excellence in science (LQ1602), and from the Large Infrastructures for Research, Experimental Development and Innovations project IT4Innovations National Supercomputing Center (LM2015070); by the internal student grant competition project SGS No. SP2018/165; by projects LO1404: Sustainable development of CENET, and CZ.1.05/2.1.00/19.0389: Research Infrastructure Development of the CENET; and by the Czech Science Foundation (GACR) projects no. 15-18274S and 17-22615S. We would also like to acknowledge partners in the ExCAPE project for providing us with training datasets related to the Pfam protein database.