Abstract
This paper proposes a technique for efficient data parallelism in neural processing units. The technique partitions data into subsets of different dimensions and redistributes similar operations among code segments that execute in parallel. The authors examine a combined approach to solving the underlying one-dimensional optimization problem. They also consider the bit-depth category of the neural processor, using dynamic programming methods. An empirical study shows that the proposed method can increase overall program throughput, measured in instructions per second, by 5–14%, depending on the complexity class of the decision problem and the degree of operation homogeneity.
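The abstract does not spell out the optimization problem, so the following is only a minimal sketch of the general problem class it names: a one-dimensional packing problem solved by dynamic programming. It assumes, purely for illustration, that operands of various bit widths must be grouped into one processor word of fixed bit depth so that as few bits as possible stay unused. The function name `pack_word` and the example widths are hypothetical and are not taken from the paper.

```python
def pack_word(widths, word_bits):
    """Pick operands whose bit widths fill a word of word_bits as fully as possible.

    Illustrative 0/1 subset-sum dynamic programming (an assumption, not the
    authors' algorithm). Returns (used_bits, chosen_indices).
    """
    # reach[s] holds one list of operand indices whose widths sum to exactly
    # s bits, or None while s has not been reached.
    reach = [None] * (word_bits + 1)
    reach[0] = []
    for i, w in enumerate(widths):
        # Walk downwards so that each operand is used at most once.
        for s in range(word_bits, w - 1, -1):
            if reach[s] is None and reach[s - w] is not None:
                reach[s] = reach[s - w] + [i]
    # The best fill is the largest reachable sum not exceeding word_bits.
    best = max(s for s in range(word_bits + 1) if reach[s] is not None)
    return best, reach[best]


if __name__ == "__main__":
    # Hypothetical operand widths (bits) and a 32-bit word.
    used, chosen = pack_word([8, 16, 12, 4, 24], 32)
    print(used, chosen)
```

The table has one entry per possible fill level, so the running time grows with the number of operands times the word width, which stays small for realistic processor bit depths.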
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Romanchuk, V.A., Bazhenov, R.I. (2020). The Technique for Data Parallelism in Neural Processing Units. In: Hu, Z., Petoukhov, S., He, M. (eds) Advances in Artificial Systems for Medicine and Education II. AIMEE2018 2018. Advances in Intelligent Systems and Computing, vol 902. Springer, Cham. https://doi.org/10.1007/978-3-030-12082-5_4
DOI: https://doi.org/10.1007/978-3-030-12082-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12081-8
Online ISBN: 978-3-030-12082-5
eBook Packages: Intelligent Technologies and Robotics (R0)