pp 1–27

Optimization of kernel learning algorithm based on parallel architecture

  • Lu Li
  • Xin Chen


This paper presents a parallel acceleration method for optimizing Gaussian hyper-parameters by maximum likelihood estimation. Optimizing the hyper-parameters requires many inversions of the kernel matrix, and the computational burden grows heavy as the scale of the kernel matrix increases. To improve computational efficiency, we introduce a decomposing and iterative (DI) algorithm. This algorithm divides the large-scale kernel matrix into four blocks and solves the matrix inversion in a constant number of iterations. Because the calculations on the sub-matrix blocks are independent of one another, they are well suited to execution on a graphics processing unit (GPU). Hence, the parallel decomposing and iterative (DIP) algorithm is introduced. Inverted pendulum and ball-plate system experiments are carried out to confirm the effectiveness of the DI and DIP algorithms. The simulation results suggest that the proposed DI and DIP algorithms are promising for real engineering applications. This paper also provides a practical and feasible approach to accelerating the optimization of hyper-parameters with maximum likelihood estimation.
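The abstract does not state the DI update rules, so the following is only a minimal sketch of the general idea of inverting a kernel matrix through four blocks with a fixed iteration count. It substitutes two standard techniques that are assumptions here, not the authors' method: the Schur-complement formula for a 2×2 block matrix, and a Newton–Schulz iteration for the block inverses. All function names, the one-dimensional Gaussian kernel, and the noise term are hypothetical and chosen for illustration.

```python
import math

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B, sign=1.0):
    """Return A + sign * B element-wise."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def newton_schulz_inverse(A, iterations=30):
    """Approximate A^{-1} with X <- X (2I - A X), run for a fixed number of steps."""
    n = len(A)
    norm_1 = max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in A)
    # Starting guess A^T / (||A||_1 ||A||_inf), a standard convergent choice.
    X = [[A[j][i] / (norm_1 * norm_inf) for j in range(n)] for i in range(n)]
    two_I = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        X = matmul(X, add(two_I, matmul(A, X), sign=-1.0))
    return X

def block_inverse(K, iterations=30):
    """Invert K = [[A, B], [C, D]] via the Schur complement S = D - C A^{-1} B."""
    n = len(K)
    if n < 2:
        raise ValueError("need at least a 2x2 matrix to form four blocks")
    h = n // 2
    A = [row[:h] for row in K[:h]]
    B = [row[h:] for row in K[:h]]
    C = [row[:h] for row in K[h:]]
    D = [row[h:] for row in K[h:]]
    A_inv = newton_schulz_inverse(A, iterations)
    # These two products do not depend on each other.
    A_inv_B = matmul(A_inv, B)
    C_A_inv = matmul(C, A_inv)
    S_inv = newton_schulz_inverse(add(D, matmul(C, A_inv_B), sign=-1.0), iterations)
    ABS = matmul(A_inv_B, S_inv)
    top_left = add(A_inv, matmul(ABS, C_A_inv))
    top_right = [[-v for v in row] for row in ABS]
    bottom_left = [[-v for v in row] for row in matmul(S_inv, C_A_inv)]
    top = [r1 + r2 for r1, r2 in zip(top_left, top_right)]
    bottom = [r1 + r2 for r1, r2 in zip(bottom_left, S_inv)]
    return top + bottom

def gaussian_kernel_matrix(xs, length_scale=1.0, noise=0.1):
    """Gaussian kernel matrix on scalar inputs, with a noise term on the diagonal."""
    return [[math.exp(-(a - b) ** 2 / (2.0 * length_scale ** 2)) + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]

if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
    K = gaussian_kernel_matrix(xs, length_scale=0.5, noise=0.1)
    P = matmul(K, block_inverse(K))
    residual = max(abs(P[i][j] - (1.0 if i == j else 0.0))
                   for i in range(len(xs)) for j in range(len(xs)))
    print("max |K K^-1 - I| =", residual)
```

In this sketch the products `A_inv_B` and `C_A_inv` are independent of each other, and every step is built from matrix multiplications, which is the kind of work that maps naturally onto a GPU. Whether the DIP algorithm parallelizes the computation in this way is not stated in the abstract.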


Keywords

Parallel computing; Gaussian hyper-parameters; Decomposing and iterative

Mathematics Subject Classification

34H05 90C06 93C40 



This work is supported by the National Natural Science Foundation of China under Grant 61473316, the Hubei Provincial Natural Science Foundation of China under Grant Nos. 2017CFA030 and 2015CFA010, and the 111 Project under Grant B17040.



Copyright information

© Springer-Verlag GmbH Austria, part of Springer Nature 2019

Authors and Affiliations

  1. School of Automation, China University of Geosciences, Wuhan, China
  2. Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
