The Technique for Data Parallelism in Neural Processing Units

Romanchuk, Vitaliy A.; Bazhenov, Ruslan I.

doi:10.1007/978-3-030-12082-5_4

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 902)

Included in the following conference series:

International Conference of Artificial Intelligence, Medical Engineering, Education

Abstract

In this paper, the authors propose a technique for efficient data parallelism in neural processing units that works with data subsets of different dimensions and redistributes similar operations between code segments executed in parallel. The authors consider a combined approach to solving the associated one-dimensional optimization problem. They also take the bit depth of the neural processor into account, using dynamic programming methods. An empirical study shows that the proposed method can improve overall program performance, measured in instructions per second, by 5–14%, depending on the complexity class of the decision problem and the degree of operation homogeneity.
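The abstract does not spell out the one-dimensional optimization problem or the dynamic programming scheme. As a rough illustration only, the sketch below assumes the problem resembles packing operands of different bit widths into one fixed-width processor word, solved with a textbook subset-sum dynamic program. The function name `pack_word`, the operand widths, and the 32-bit word size are hypothetical and are not taken from the paper.

```python
def pack_word(widths, word_bits):
    """Pick operands whose bit widths fill one word as fully as possible.

    Illustrative assumption, not the authors' algorithm: a standard
    subset-sum dynamic program over positive integer widths.
    reachable[s] holds one list of operand indices whose widths sum to
    exactly s bits, or None if no such subset has been found.
    """
    reachable = [None] * (word_bits + 1)
    reachable[0] = []
    for i, w in enumerate(widths):
        # Walk sums downwards so each operand is used at most once.
        for s in range(word_bits, w - 1, -1):
            if reachable[s] is None and reachable[s - w] is not None:
                reachable[s] = reachable[s - w] + [i]
    # The largest reachable sum is the best fill of the word.
    best = max(s for s in range(word_bits + 1) if reachable[s] is not None)
    return best, reachable[best]


if __name__ == "__main__":
    # Hypothetical operand widths (bits) and a hypothetical 32-bit word.
    used_bits, chosen = pack_word([8, 16, 4, 12, 8], 32)
    print(used_bits, chosen)
```

Operands packed into the same word could then be processed by one instruction, which is one plausible reading of how bit depth interacts with data parallelism. The paper itself should be consulted for the actual formulation.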

Author information

Authors and Affiliations

Ryazan State University named for S. Yesenin, Ryazan, Russia
Vitaliy A. Romanchuk
Sholom-Aleichem Priamursky State University, Birobidzhan, Russia
Ruslan I. Bazhenov

Corresponding author

Correspondence to Ruslan I. Bazhenov.

Editor information

Editors and Affiliations

School of Educational Information Technology, Central China Normal University, Wuhan, Hubei, China
Zhengbing Hu
Mechanical Engineering Research Institute, Russian Academy of Sciences, Moscow, Russia
Sergey V. Petoukhov
Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Davie, FL, USA
Matthew He

About this paper

Cite this paper

Romanchuk, V.A., Bazhenov, R.I. (2020). The Technique for Data Parallelism in Neural Processing Units. In: Hu, Z., Petoukhov, S., He, M. (eds) Advances in Artificial Systems for Medicine and Education II. AIMEE2018 2018. Advances in Intelligent Systems and Computing, vol 902. Springer, Cham. https://doi.org/10.1007/978-3-030-12082-5_4

DOI: https://doi.org/10.1007/978-3-030-12082-5_4
Published: 03 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12081-8
Online ISBN: 978-3-030-12082-5
eBook Packages: Intelligent Technologies and Robotics (R0)
