
Neural Network Circuits and Parallel Implementations

  • Ke-Lin Du
  • M. N. S. Swamy

Abstract

Hardware and parallel implementations can substantially speed up machine learning algorithms, broadening the range of their practical applications. In this chapter, we first introduce circuit realizations of popular neural network learning methods. We then describe their parallel implementations on graphics processing units (GPUs), systolic processor arrays, and parallel computers.
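The speedup these platforms provide typically comes from data parallelism: the training data are partitioned across processing elements, each element computes a partial result, and the partials are combined. As a minimal sketch (not taken from the chapter; the function names and least-squares objective are illustrative assumptions), the same pattern can be shown with ordinary CPU processes computing a gradient shard-by-shard:

```python
import numpy as np
from multiprocessing import Pool

def local_gradient(args):
    # Least-squares gradient on one data shard: X^T (X w - y) / n
    X, y, w = args
    return X.T @ (X @ w - y) / len(y)

def parallel_gradient(X, y, w, n_workers=4):
    # Split the data into shards, compute per-shard gradients in
    # parallel worker processes, then recombine -- the same
    # data-parallel pattern exploited by GPU and multiprocessor
    # implementations of learning algorithms.
    shards = list(zip(np.array_split(X, n_workers),
                      np.array_split(y, n_workers),
                      [w] * n_workers))
    with Pool(n_workers) as pool:
        grads = pool.map(local_gradient, shards)
    # Size-weighted average makes the result identical to the
    # serial gradient even when shards are unequal.
    sizes = np.array([len(s[1]) for s in shards])
    return np.average(grads, axis=0, weights=sizes)
```

Because the per-shard computations are independent, the combine step is the only communication required; this is why the gradient-style algorithms surveyed in the chapter map well onto GPUs and systolic arrays.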


Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada
  2. Xonlink Inc., Hangzhou, China
