Abstract
A program in a dataflow architecture is represented as a dataflow graph. The nodes of the graph represent operations to be executed on data, and the edges represent the data values that the nodes transform. Such an architecture allows exploitation of parallelism, code sharing, and out-of-order execution. The dataflow nodes draw their operations from a small set of operators: logical operations, switching, addition/subtraction, and multiplication. There is neither an arithmetic logic unit nor a floating-point unit. As a result, elementary operations for integer arithmetic, and in particular for floating-point arithmetic, are emulated in software. Consequently, when more advanced functionality such as trigonometric functions is required, we find that the commonly used implementations are inefficient. This inefficiency inflates the dataflow graph, which translates directly into wasted silicon area, higher power consumption, and lower throughput. Volder proposed the CORDIC algorithm for trigonometric functions, expressed in terms of basic rotations. In this work, we present a correctly rounded and efficient implementation of the CORDIC algorithm for the dataflow architecture.
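To make the idea of "basic rotations" concrete, the following is a minimal sketch of rotation-mode CORDIC in fixed-point arithmetic. It is not the implementation presented in the chapter and it is not correctly rounded. The names `FRAC_BITS`, `ITERATIONS`, and `cordic_sin_cos`, the choice of 30 fractional bits, and the restriction of the input to [-pi/2, pi/2] are all assumptions made for illustration. The loop body uses only shifts, additions, subtractions, and a sign test, which is the kind of operator set the abstract describes.

```python
import math

FRAC_BITS = 30            # fixed-point fractional bits (illustrative choice)
ITERATIONS = 30           # number of micro-rotations (illustrative choice)
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

# Table of rotation angles atan(2^-i), stored in fixed point.
ANGLES = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i).
# Starting from x = 1/gain cancels the accumulated stretch.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))
X0 = round(ONE / GAIN)


def cordic_sin_cos(theta):
    """Approximate (sin(theta), cos(theta)) for theta in [-pi/2, pi/2]."""
    if not -math.pi / 2 <= theta <= math.pi / 2:
        raise ValueError("theta must lie in [-pi/2, pi/2]")
    x, y = X0, 0
    z = round(theta * ONE)            # residual angle still to rotate by
    for i in range(ITERATIONS):
        if z >= 0:                    # rotate counter-clockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:                         # rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return y / ONE, x / ONE


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, math.sin(0.5))
    print(c, math.cos(0.5))
```

The conversion to and from floating point at the boundaries is only for convenience in this sketch. A correctly rounded version would additionally need argument reduction, extra guard bits, and a final rounding step, none of which are shown here.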
Notes
1. The significand is sometimes called the mantissa, but that usage is discouraged; the term mantissa should be reserved for the context of logarithms.
2. Faithful rounding has a maximum error of one ulp and is not defined by IEEE-754. It is mentioned here as a less precise rounding mode, to emphasize the high precision requirement.
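The difference between correct and faithful rounding can be illustrated with a small sketch that measures an error in ulps. The helper `ulp_error` is hypothetical and uses Python's `math.ulp` (Python 3.9 or later) as its definition of an ulp, which is only one of several definitions in use. The value 1/3 is chosen because its exact value is a simple rational.

```python
import math
from fractions import Fraction


def ulp_error(approx, exact):
    """Error of the float `approx` against the rational `exact`, in ulps of `approx`."""
    return abs(Fraction(approx) - exact) / Fraction(math.ulp(approx))


exact = Fraction(1, 3)
nearest = 1 / 3                            # division rounds to the nearest float
neighbour = math.nextafter(nearest, 1.0)   # the float on the other side of 1/3

print(float(ulp_error(nearest, exact)))    # at most 0.5 ulp: correctly rounded
print(float(ulp_error(neighbour, exact)))  # above 0.5 but below 1 ulp: only faithful
```

Both floats are within one ulp of the exact value, so either would be an acceptable faithful result, but only the first is within half an ulp.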
References
IEEE Standard for Floating-Point Arithmetic. IEEE Std 754–2019 (Revision of IEEE 754–2008), pp. 1–84, July 2019. https://doi.org/10.1109/IEEESTD.2019.8766229
Ziv, A.: Fast evaluation of elementary mathematical functions with correctly rounded last bit. ACM Trans. Math. Softw. (TOMS) (1991). https://dl.acm.org/doi/abs/10.1145/114697.116813
Burgess, N., Milanovic, J., Stephens, N., Monachopoulos, K., Mansell, D.: Bfloat16 processing for neural networks. In: 2019 IEEE 26th Symposium on Computer Arithmetic (ARITH), pp. 88–91, June 2019. https://doi.org/10.1109/ARITH.2019.00022. ISSN 1063-6889
Courbariaux, M., Bengio, Y., David, J.P.: Low precision arithmetic for deep learning. CoRR abs/1412.7024 (2014)
Daramy-Loirat, C., De Dinechin, F., Defour, D., Gallet, M., Gast, N., Lauter, C.: CRLIBM: a library of correctly rounded elementary functions in double-precision (2010). http://lipforge.ens-lyon.fr/www/crlibm/
Dettmers, T.: 8-bit approximations for parallelism in deep learning. arXiv preprint arXiv:1511.04561 (2015)
Fousse, L., Hanrot, G., Lefèvre, V., Pélissier, P., Zimmermann, P.: MPFR: a multiple-precision binary floating-point library with correct rounding. ACM Trans. Math. Softw. (TOMS) 33(2), 13 (2007)
Goldberg, D.: What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. (CSUR) 23(1), 5–48 (1991)
Gupta, S., Agrawal, A., Gopalakrishnan, K., Narayanan, P.: Deep learning with limited numerical precision. In: Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, vol. 37, pp. 1737–1746. JMLR.org (2015)
Hao, X., Yang, S., Wang, J., Deng, B., Wei, X., Yi, G.: Efficient implementation of cerebellar purkinje cell with the CORDIC algorithm on LaCSNN. Front. Neurosci. 13 (2019). https://doi.org/10.3389/fnins.2019.01078, https://www.frontiersin.org/articles/10.3389/fnins.2019.01078/full
Hekstra, G., Deprettere, E.: Floating point Cordic. In: Proceedings of IEEE 11th Symposium on Computer Arithmetic, pp. 130–137, June 1993. https://doi.org/10.1109/ARITH.1993.378100
Jaeger, A.: OpenLibm (2016)
Johnson, J.: Rethinking floating point for deep learning. arXiv preprint arXiv:1811.01721 (2018)
Kahan, W.: A logarithm too clever by half (2004)
Köster, U., et al.: Flexpoint: an adaptive numerical format for efficient training of deep neural networks. In: Advances in Neural Information Processing Systems, pp. 1742–1752 (2017)
Lakshmi, B., Dhar, A.: CORDIC architectures: a survey. VLSI Des. 2010, 2 (2010)
Maire, J.L., Brunie, N., Dinechin, F.D., Muller, J.M.: Computing floating-point logarithms with fixed-point operations. In: 2016 IEEE 23rd Symposium on Computer Arithmetic (ARITH), pp. 156–163, July 2016. https://doi.org/10.1109/ARITH.2016.24. ISSN 1063-6889
Meher, P.K., Valls, J., Juang, T.B., Sridharan, K., Maharatna, K.: 50 years of CORDIC: algorithms, architectures, and applications. IEEE Trans. Circuits Syst. I Regul. Pap. 56(9), 1893–1907 (2009). https://doi.org/10.1109/TCSI.2009.2025803
Muller, J.M.: On the definition of ulp(x). Research Report RR-5504, LIP RR-2005-09, INRIA, LIP, February 2005. https://hal.inria.fr/inria-00070503
Muller, J.M.: Elementary Functions: Algorithms and Implementation, 3 edn. Birkhäuser Basel (2016). https://www.springer.com/gp/book/9781489979810
Muller, J.M., et al.: Handbook of Floating-Point Arithmetic, 2 edn. Birkhäuser Basel (2018). https://doi.org/10.1007/978-3-319-76526-6, https://www.springer.com/gp/book/9783319765259
Nguyen, H.T., Nguyen, X.T., Hoang, T.T., Le, D.H., Pham, C.K.: Low-resource low-latency hybrid adaptive CORDIC with floating-point precision. IEICE Electron. Exp. 12(9), 20150258–20150258 (2015)
Payne, M.H., Hanek, R.N.: Radian reduction for trigonometric functions. ACM SIGNUM Newslett. 18(1), 19–24 (1983)
Tulloch, A., Jia, Y.: High performance ultra-low-precision convolutions on mobile devices. arXiv preprint arXiv:1712.02427 (2017)
Volder, J.E.: The CORDIC trigonometric computing technique. IRE Trans. Electron. Comput. EC-8(3), 330–334 (1959)
Acknowledgments
We thank Shachar Lovett for his valuable input, John Gustafson for his comments, and Laura Ferguson for her assistance.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Khankin, D., Raz, E., Tayari, I. (2020). Efficient CORDIC-Based Sine and Cosine Implementation for a Dataflow Architecture. In: Dolev, S., Kolesnikov, V., Lodha, S., Weiss, G. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2020. Lecture Notes in Computer Science, vol 12161. Springer, Cham. https://doi.org/10.1007/978-3-030-49785-9_9
DOI: https://doi.org/10.1007/978-3-030-49785-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49784-2
Online ISBN: 978-3-030-49785-9
eBook Packages: Computer Science (R0)