Improved Sublinear Primal-Dual Algorithm for Support Vector Machines
The sublinear primal-dual algorithm (SUPDA) is a well-established sublinear-time algorithm. However, SUPDA performs the primal step in every iteration, which is unnecessary because its overall regret is dominated by the dual step. To improve the efficiency of SUPDA, we propose an improved SUPDA (ISUPDA) and apply it to linear support vector machines, yielding an improved sublinear primal-dual algorithm for linear support vector machines (ISUPDA-SVM). Specifically, unlike SUPDA, which conducts the primal step in every iteration, ISUPDA executes the primal step with a probability at each iteration, which reduces the time complexity of SUPDA. We prove that the expected regret of ISUPDA is still dominated by the dual step, so ISUPDA retains the convergence guarantee. We further convert linear support vector machines into a saddle-point form in order to apply ISUPDA to them, and we provide theoretical guarantees on both the solution quality and the efficiency of ISUPDA-SVM. Comparison experiments on multiple datasets demonstrate that ISUPDA outperforms SUPDA and that ISUPDA-SVM is an efficient algorithm for linear support vector machines.
Keywords: Sublinear primal-dual algorithm · Regret analysis · Randomized algorithm · Linear support vector machines
The work was supported in part by the National Natural Science Foundation of China under grant No. 61673293.
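The core idea described in the abstract, performing the dual step every iteration while executing the primal step only with some probability, can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the update rules, the step size `eta`, the horizon `T`, and the primal-step probability `q` are all illustrative assumptions, applied here to the standard SVM saddle-point formulation min over w, max over p in the simplex, of the p-weighted hinge loss.

```python
import numpy as np

def isupda_svm_sketch(X, y, T=2000, eta=0.1, q=0.5, seed=0):
    # Illustrative primal-dual loop (assumed details, not the paper's method):
    # dual variable p lives on the probability simplex over training points,
    # primal variable w is kept in the unit Euclidean ball.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    p = np.ones(n) / n
    w_avg = np.zeros(d)
    for _ in range(T):
        # Dual step (every iteration): multiplicative-weights update that
        # up-weights points with large hinge loss.
        margins = y * (X @ w)
        p *= np.exp(eta * np.clip(1.0 - margins, 0.0, None))
        p /= p.sum()
        # Primal step, executed only with probability q per iteration --
        # the efficiency idea the abstract attributes to ISUPDA.
        if rng.random() < q:
            grad = -(p * y) @ X                    # gradient of weighted loss in w
            w -= eta * grad
            w /= max(1.0, np.linalg.norm(w))       # project onto the unit ball
        w_avg += w
    return w_avg / T

# Toy usage on linearly separable data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = np.sign(X @ true_w)
w = isupda_svm_sketch(X, y)
acc = np.mean(np.sign(X @ w) == y)
```

Skipping the primal step saves work on iterations where only the dual variable moves; the abstract's regret analysis argues that, in expectation, this does not change the dominant (dual) term of the regret.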