The Technique for Data Parallelism in Neural Processing Units

  • Conference paper
Advances in Artificial Systems for Medicine and Education II (AIMEE2018 2018)

Abstract

In this paper, the authors propose a technique for efficient data parallelism in neural processing units, based on partitioning data into subsets of different dimensions and redistributing similar operations among code segments that execute in parallel. The authors examine a combined approach to solving the resulting one-dimensional optimization problem. They also consider a category of neural processor bit depth, using dynamic programming methods. An empirical study shows that applying the proposed method can improve overall program throughput, measured in instructions per second, by 5–14%, depending on the complexity class of the decision problem and the degree of operation homogeneity.
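The abstract does not give the authors' formulation, so the following is only a minimal sketch of the general kind of one-dimensional packing problem that dynamic programming can solve. It assumes, purely for illustration, that operands of different bit widths are to be packed into one processor word of fixed bit depth so that as few bits as possible are left unused. The function name `pack_operands` and the parameters `widths` and `word_bits` are hypothetical and do not come from the paper.

```python
def pack_operands(widths, word_bits):
    """Choose a subset of operand widths whose sum is as large as possible
    without exceeding word_bits (0/1 subset-sum solved by dynamic programming).

    Illustrative assumption only: this is not the method from the paper.
    Returns (filled_bits, chosen_indices).
    """
    # best[c] holds one list of operand indices that fills exactly c bits,
    # or None if no subset seen so far reaches that fill level.
    best = [None] * (word_bits + 1)
    best[0] = []

    for i, w in enumerate(widths):
        # Walk capacities downward so each operand is used at most once.
        for c in range(word_bits, w - 1, -1):
            if best[c] is None and best[c - w] is not None:
                best[c] = best[c - w] + [i]

    # The answer is the largest fill level that is reachable.
    for c in range(word_bits, -1, -1):
        if best[c] is not None:
            return c, best[c]


if __name__ == "__main__":
    widths = [8, 12, 16, 4, 10]  # hypothetical operand widths in bits
    filled, chosen = pack_operands(widths, word_bits=32)
    print(filled, [widths[i] for i in chosen])
```

The table has `word_bits + 1` entries and each operand triggers one pass over it, so the running time is proportional to the number of operands times the word width. That cost is small for the word sizes typical of processing units.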

Corresponding author

Correspondence to Ruslan I. Bazhenov.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Romanchuk, V.A., Bazhenov, R.I. (2020). The Technique for Data Parallelism in Neural Processing Units. In: Hu, Z., Petoukhov, S., He, M. (eds) Advances in Artificial Systems for Medicine and Education II. AIMEE2018 2018. Advances in Intelligent Systems and Computing, vol 902. Springer, Cham. https://doi.org/10.1007/978-3-030-12082-5_4
