Abstract
The efficiency and scalability of online learning methods make them a popular choice for learning problems with big data and limited memory. Most existing online learning approaches are based on global models, which assume the incoming examples are linearly separable. However, this assumption does not always hold in practice. The local online learning framework was therefore proposed to solve non-linearly separable tasks without kernel modeling. However, weights in the local online learning framework are updated using only first-order information, which significantly limits performance. Intuitively, second-order online learning algorithms, e.g., Soft Confidence-Weighted (SCW) learning, can alleviate this issue. Inspired by second-order algorithms and the local online learning framework, we propose a Soft Confidence-Weighted Local Online Learning (SCW-LOL) algorithm, which extends the single-hyperplane SCW to the case of multiple local hyperplanes. The local hyperplanes are connected by a common component and optimized simultaneously. We also examine the theoretical relationship between the single- and multiple-hyperplane cases. Extensive experimental results show that the proposed SCW-LOL converges online and achieves the best overall performance on almost all datasets, without any kernel modeling or parameter tuning.
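To make the local learning idea concrete, the following is a minimal sketch, not the authors' implementation: each incoming example is assigned to the nearest of a few fixed prototypes, and that prototype's local hyperplane is updated with a diagonal-covariance SCW-I rule. The shared common component and the prototype learning of SCW-LOL are omitted, and the class name `LocalSCW` and all hyperparameter values are illustrative.

```python
import math
from statistics import NormalDist


class LocalSCW:
    """Sketch of local online learning: one diagonal SCW-I
    classifier per region, regions given by fixed prototypes."""

    def __init__(self, prototypes, dim, C=1.0, eta=0.9):
        self.prototypes = prototypes
        self.C = C
        self.phi = NormalDist().inv_cdf(eta)  # confidence parameter
        k = len(prototypes)
        self.w = [[0.0] * dim for _ in range(k)]      # local mean vectors
        self.sigma = [[1.0] * dim for _ in range(k)]  # diagonal covariances

    def _region(self, x):
        # assign x to the nearest prototype (squared Euclidean distance)
        return min(range(len(self.prototypes)),
                   key=lambda k: sum((a - b) ** 2
                                     for a, b in zip(x, self.prototypes[k])))

    def predict(self, x):
        k = self._region(x)
        score = sum(wi * xi for wi, xi in zip(self.w[k], x))
        return 1 if score >= 0 else -1

    def update(self, x, y):
        """SCW-I closed-form update on the local hyperplane of x."""
        k = self._region(x)
        w, s = self.w[k], self.sigma[k]
        v = sum(si * xi * xi for si, xi in zip(s, x))  # x^T Sigma x
        m = y * sum(wi * xi for wi, xi in zip(w, x))   # signed margin
        if self.phi * math.sqrt(v) - m <= 0:
            return  # zero confidence-weighted loss: no update
        phi2 = self.phi ** 2
        psi, zeta = 1 + phi2 / 2, 1 + phi2
        alpha = min(self.C, max(0.0,
                    (-m * psi + math.sqrt(m * m * phi2 ** 2 / 4
                                          + v * phi2 * zeta)) / (v * zeta)))
        u = (-alpha * v * self.phi
             + math.sqrt(alpha ** 2 * v * v * phi2 + 4.0 * v)) ** 2 / 4.0
        beta = alpha * self.phi / (math.sqrt(u) + v * alpha * self.phi)
        for i, xi in enumerate(x):
            w[i] += alpha * y * s[i] * xi      # second-order mean update
            s[i] -= beta * (s[i] * xi) ** 2    # shrink confidence


# XOR-like data: not separable by one hyperplane, but each of the
# two prototype regions is linearly separable on its own.
clf = LocalSCW(prototypes=[(-1.0, 0.0), (1.0, 0.0)], dim=2)
data = [((1.0, 1.0), 1), ((1.0, -1.0), -1),
        ((-1.0, 1.0), -1), ((-1.0, -1.0), 1)]
for _ in range(5):
    for x, y in data:
        clf.update(x, y)
```

The XOR example illustrates why the multiple-hyperplane setting matters: a single global linear model cannot fit these labels, while two local SCW hyperplanes classify them correctly after a few passes.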
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Yang, X., Zhou, J., Zhao, P., Chen, C., Chen, C., Li, X. (2018). A Local Online Learning Approach for Non-linear Data. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science(), vol 10938. Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_34
Print ISBN: 978-3-319-93036-7
Online ISBN: 978-3-319-93037-4
eBook Packages: Computer Science, Computer Science (R0)