Training Recurrent Neural Network Using Multistream Extended Kalman Filter on Multicore Processor and CUDA Enabled Graphic Processor Unit
Recurrent neural networks are popular tools for modeling time series. They are commonly trained with gradient-based algorithms. Approaches based on Kalman filtering, on the other hand, are considered the most appropriate general-purpose training algorithms with respect to modeling accuracy. Their main drawbacks are high computational requirements and difficult implementation. In this work we first provide a clear description of the training algorithm in a simple pseudo-language. We then address the high computational requirements by performing the calculations on a multicore processor and on a CUDA-enabled graphics processing unit. We show that a substantial reduction in execution time can be achieved by performing the computation on a manycore graphics processing unit.
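To make the kind of computation involved concrete, the following is a minimal sketch of one extended Kalman filter weight update. It is not the pseudo-language description from the paper and it is not the multistream variant: it assumes a single training stream and a single scalar network output, so the innovation term is a scalar and no matrix inversion is needed. The names `ekf_step` and `demo`, and the noise parameters `r` and `q`, are hypothetical choices made for illustration. In the demo the "network" is simply a linear model, so the derivative vector `H` equals the input.

```python
def ekf_step(w, P, H, error, r=1.0, q=1e-4):
    """One EKF weight update for a single scalar output (illustrative sketch).

    w     : list of n weights
    P     : n x n error covariance matrix (list of lists, symmetric)
    H     : list of n derivatives d(output)/d(w_i)
    error : target minus network output
    r, q  : assumed measurement and process noise variances
    """
    n = len(w)
    # PH = P * H (matrix-vector product)
    PH = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]
    # Scalar innovation variance: r + H^T P H
    s = r + sum(H[i] * PH[i] for i in range(n))
    # Kalman gain K = P H / s
    K = [PH[i] / s for i in range(n)]
    # Weight update: w <- w + K * error
    w_new = [w[i] + K[i] * error for i in range(n)]
    # Covariance update: P <- P - K (H^T P) + q I, using H^T P = (P H)^T
    P_new = [
        [P[i][j] - K[i] * PH[j] + (q if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return w_new, P_new


def demo(steps=60):
    """Fit a linear model y = w . x to noise-free data from w* = [2, -1]."""
    true_w = [2.0, -1.0]
    inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    w = [0.0, 0.0]
    P = [[100.0, 0.0], [0.0, 100.0]]  # large initial uncertainty
    for t in range(steps):
        x = inputs[t % len(inputs)]
        target = sum(a * b for a, b in zip(true_w, x))
        output = sum(a * b for a, b in zip(w, x))
        # For a linear model the derivative of the output w.r.t. w is x itself
        w, P = ekf_step(w, P, x, target - output)
    return w


if __name__ == "__main__":
    print(demo())
```

The covariance update touches every entry of `P`, so the cost of one step grows quadratically with the number of weights. This is the kind of dense linear algebra that motivates moving the computation to multicore and GPU hardware.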
Keywords: Weight Connection, Recurrent Neural Network, Hidden Unit, Output Unit, Previous Time Step