Abstract
Back-propagation is a well-known technique used in the implementation of artificial neural networks. The algorithm can be described essentially as a sequence of matrix-vector multiplications and outer-product operations, interspersed with the application of a pointwise nonlinear function. The algorithm is compute-intensive and lends itself to a high degree of parallelism. These features motivate a systolic hardware design for the back-propagation algorithm. We present in this chapter a new systolic architecture for the complete back-propagation algorithm. For a neural network with N input neurons, P hidden-layer neurons, and M output neurons, the proposed architecture with P processors has a running time of (2N + 2M + P + max(M,P)) for each training-set vector. This is the first such implementation of the back-propagation algorithm that completely parallelizes the entire computation of the learning phase. The array has been implemented on an Annapolis FPGA-based coprocessor and achieves very favorable performance, in the range of 5 GOPS. The proposed new design targets Virtex boards.
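The structure the abstract describes, two matrix-vector products on the forward pass, error propagation through the transposed weight matrix on the backward pass, and outer products for the weight updates, is what the systolic array parallelizes. A minimal NumPy sketch of one sequential training step follows; the sigmoid nonlinearity, learning rate, and all names are illustrative assumptions, not taken from the chapter.

```python
# Sketch of one back-propagation step for an N-input, P-hidden, M-output
# network. The operations (matrix-vector products, pointwise nonlinearity,
# outer-product weight updates) mirror the computation the chapter's
# systolic array parallelizes; details here are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(W1, W2, x, t, lr=0.1):
    # Forward pass: matrix-vector products with a pointwise nonlinearity.
    h = sigmoid(W1 @ x)                           # hidden activations, shape (P,)
    y = sigmoid(W2 @ h)                           # outputs, shape (M,)
    # Backward pass: error propagated through W2 (another matrix-vector product).
    delta_out = (y - t) * y * (1.0 - y)           # output deltas, shape (M,)
    delta_hid = (W2.T @ delta_out) * h * (1.0 - h)  # hidden deltas, shape (P,)
    # Weight updates: outer products of deltas with upstream activations.
    W2 = W2 - lr * np.outer(delta_out, h)
    W1 = W1 - lr * np.outer(delta_hid, x)
    return W1, W2

# Hypothetical sizes matching the abstract's N, P, M notation.
N, P, M = 4, 3, 2
rng = np.random.default_rng(0)
W1 = rng.standard_normal((P, N))
W2 = rng.standard_normal((M, P))
x = rng.standard_normal(N)
t = np.array([0.0, 1.0])
W1, W2 = backprop_step(W1, W2, x, t)
```

Each training vector drives exactly one such step; the chapter's architecture pipelines these steps so that the (2N + 2M + P + max(M,P)) running time per vector is achieved with P processors.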
We also describe the process of automatically deriving these high-performance architectures using the systolic array design tool MMAlpha. This allows us to specify our system in a very high-level language (Alpha) and perform design-space exploration to obtain architectures whose performance is comparable to that obtained using hand-optimized VHDL code.
Copyright information
© 2006 Springer
Cite this chapter
Paul, K., Rajopadhye, S. (2006). Back-Propagation Algorithm Achieving 5 Gops on the Virtex-E. In: Omondi, A.R., Rajapakse, J.C. (eds) FPGA Implementations of Neural Networks. Springer, Boston, MA. https://doi.org/10.1007/0-387-28487-7_5
DOI: https://doi.org/10.1007/0-387-28487-7_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-28485-9
Online ISBN: 978-0-387-28487-3
eBook Packages: Engineering (R0)