Abstract
This paper proposes a distributed ADMM algorithm on the Spark platform for kernel regression on large-scale data. Because the full kernel matrix of a large data set is expensive to compute and store, we approximate it with the Nyström sampling method and solve the kernel regression problem using the approximation. To verify the effectiveness of the algorithm, we ran numerical experiments on the Spark big data platform. The results show that a sampling ratio of 2–5% gives the most reasonable trade-off between approximation accuracy and computational cost. The approximate kernel matrix makes it possible to solve problems that are intractable with the exact kernel matrix, and approximate kernel regression can handle large-scale data at greatly reduced computational cost while retaining satisfactory accuracy.
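To make the Nyström step concrete, the following is a minimal single-machine sketch in Python with NumPy. It is not the authors' implementation: the Gaussian kernel, the uniform landmark sampling, the eigenvalue cutoff, the toy data, and all function and parameter names (`fit_nystrom`, `gamma`, `lam`, and so on) are assumptions made for illustration. The regression is solved in closed form rather than with ADMM, and nothing is distributed over Spark.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def fit_nystrom(X, ratio, gamma, rng):
    """Sample landmarks and build the map K_mm^{-1/2} used for Nystrom features."""
    n = X.shape[0]
    m = max(1, int(round(ratio * n)))           # number of sampled landmarks
    idx = rng.choice(n, size=m, replace=False)  # uniform sampling (an assumption)
    landmarks = X[idx]
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    eigvals, eigvecs = np.linalg.eigh(K_mm)
    keep = eigvals > 1e-10 * eigvals.max()      # drop numerically null directions
    transform = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return landmarks, transform


def nystrom_features(X, landmarks, transform, gamma):
    """Map X to features Z such that Z @ Z.T approximates the full kernel matrix."""
    return rbf_kernel(X, landmarks, gamma) @ transform


def fit_ridge(Z, y, lam):
    """Closed-form solution of min_w ||Z w - y||^2 + lam * ||w||^2."""
    r = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(r), Z.T @ y)


# Toy data (hypothetical): a noisy sine curve with 400 samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(400, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)

gamma, lam = 0.5, 1e-3
landmarks, transform = fit_nystrom(X, ratio=0.05, gamma=gamma, rng=rng)  # 5% sampling
Z = nystrom_features(X, landmarks, transform, gamma)
w = fit_ridge(Z, y, lam)
y_hat = Z @ w  # fitted values from the approximate kernel regression
```

With a 5% sampling ratio, the sketch only ever forms a 400 × 20 block of the kernel matrix and a 20 × 20 landmark matrix instead of the full 400 × 400 matrix, which is the saving the abstract refers to. In a distributed setting, the rows of `Z` could be partitioned across workers, with ADMM coordinating the per-partition solutions.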
Acknowledgements
Lina Sun thanks the NSFC for its support under grants 11690010 and 11631013, and the National Engineering Laboratory for Big Data Analysis for its support. The authors also thank the reviewers for their constructive comments.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sun, L., Jing, W., Zhang, C., Zhu, H. (2020). Approximate Kernel Regression Based on Distributed ADMM Algorithm. In: Jain, V., Patnaik, S., Popențiu Vlădicescu, F., Sethi, I. (eds) Recent Trends in Intelligent Computing, Communication and Devices. Advances in Intelligent Systems and Computing, vol 1006. Springer, Singapore. https://doi.org/10.1007/978-981-13-9406-5_20
DOI: https://doi.org/10.1007/978-981-13-9406-5_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9405-8
Online ISBN: 978-981-13-9406-5
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)