Atari Games and Intel Processors

  • Robert Adamski
  • Tomasz Grel
  • Maciej Klimek
  • Henryk Michalewski
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 818)

Abstract

The asynchronous nature of state-of-the-art reinforcement learning algorithms, such as the Asynchronous Advantage Actor-Critic (A3C) algorithm, makes them exceptionally suitable for CPU computations. However, because deep reinforcement learning often deals with interpreting visual information, a large part of the training and inference time is spent performing convolutions.
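
For concreteness, the sketch below shows the kind of small convolutional actor-critic network typically used in A3C-style Atari agents, following the architecture popularized by Mnih et al. [14, 15]. The layer sizes shown are the commonly used ones and are an assumption, not necessarily the exact configuration evaluated here; even at this modest scale, the three convolutional layers account for most of the floating-point work per frame.

    # Hypothetical sketch of an A3C-style convolutional actor-critic network
    # for Atari (stacks of 4 grayscale 84x84 frames), after Mnih et al. [15].
    # Layer sizes are assumptions; the exact network may differ.
    import tensorflow as tf

    def build_a3c_network(num_actions):
        frames = tf.keras.Input(shape=(84, 84, 4), name="frames")
        # The convolutional stack below dominates per-step compute time.
        x = tf.keras.layers.Conv2D(32, 8, strides=4, activation="relu")(frames)
        x = tf.keras.layers.Conv2D(64, 4, strides=2, activation="relu")(x)
        x = tf.keras.layers.Conv2D(64, 3, strides=1, activation="relu")(x)
        x = tf.keras.layers.Flatten()(x)
        x = tf.keras.layers.Dense(512, activation="relu")(x)
        # Actor head: action probabilities; critic head: state-value estimate.
        policy = tf.keras.layers.Dense(num_actions, activation="softmax",
                                       name="policy")(x)
        value = tf.keras.layers.Dense(1, name="value")(x)
        return tf.keras.Model(inputs=frames, outputs=[policy, value])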

In this work we present our results on learning strategies in Atari games using a convolutional neural network, the Intel Math Kernel Library (MKL) and the TensorFlow framework. We also analyze the effects of asynchronous computations on the convergence of reinforcement learning algorithms.
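
As an illustration of the CPU-side setup, the snippet below sketches how thread-level parallelism is typically configured when running MKL-enabled TensorFlow on a multi-core Xeon. It uses the TensorFlow 1.x API of the paper's era, and the specific thread counts and environment values are illustrative assumptions drawn from Intel's published tuning guidance [20, 23], not the settings used in the experiments.

    # Hedged sketch: CPU/MKL threading setup for TensorFlow 1.x-era code.
    # NUM_CORES and all knob values are illustrative assumptions.
    import os
    import tensorflow as tf

    NUM_CORES = 18  # assumption: physical cores on one Xeon socket

    # OpenMP/MKL environment knobs recommended in Intel's tuning guides.
    os.environ["OMP_NUM_THREADS"] = str(NUM_CORES)
    os.environ["KMP_BLOCKTIME"] = "0"  # idle OpenMP threads sleep immediately
    os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"  # pin threads

    # Intra-op threads parallelize a single op (e.g. one convolution);
    # inter-op threads run independent graph ops concurrently.
    config = tf.ConfigProto(
        intra_op_parallelism_threads=NUM_CORES,
        inter_op_parallelism_threads=2,
    )
    session = tf.Session(config=config)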

Notes

Acknowledgments

This research was supported in part by PL-Grid Infrastructure, grant identifier openaigym.

References

  1.
  2. Intel Xeon Phi delivers competitive performance for deep learning—and getting better fast, December 2016. https://software.intel.com/en-us/articles/intel-xeon-phi-delivers-competitive-performance-for-deep-learning-and-getting-better-fast
  3. Caffe Optimized for Intel Architecture: applying modern code techniques, February 2017. https://software.intel.com/en-us/articles/caffe-optimized-for-intel-architecture-applying-modern-code-techniques
  4. FALCON Library: fast image convolution in neural networks on Intel architecture, February 2017. https://colfaxresearch.com/falcon-library/
  5. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). Software available from http://tensorflow.org/
  6. Adamski, I., Grel, T., Jȩdrych, A., Kaczmarek, K., Michalewski, H.: Solving Atari games with distributed reinforcement learning, October 2017. https://blog.deepsense.ai/solving-atari-games-with-distributed-reinforcement-learning/
  7. Babaeizadeh, M., Frosio, I., Tyree, S., Clemons, J., Kautz, J.: GA3C: GPU-based A3C for deep reinforcement learning. CoRR abs/1611.06256 (2016). http://arxiv.org/abs/1611.06256
  8. Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13(1), 281–305 (2012)
  9. Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., Zaremba, W.: OpenAI Gym. CoRR abs/1606.01540 (2016). http://arxiv.org/abs/1606.01540
  10. Dubey, P.: Myth busted: general purpose CPUs can’t tackle deep neural network training, June 2016. https://itpeernetwork.intel.com/myth-busted-general-purpose-cpus-cant-tackle-deep-neural-network-training/
  11. Guennebaud, G., Jacob, B., et al.: Eigen v3 (2010). http://eigen.tuxfamily.org
  12. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R.B., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. CoRR abs/1408.5093 (2014). http://arxiv.org/abs/1408.5093
  13. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980
  14. Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T.P., Harley, T., Silver, D., Kavukcuoglu, K.: Asynchronous methods for deep reinforcement learning. CoRR abs/1602.01783 (2016). http://arxiv.org/abs/1602.01783
  15. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M.A., Fidjeland, A., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
  16. Mulnix, D.: Intel Xeon processor E5-2600 v4 product family technical overview, January 2017. https://software.intel.com/en-us/articles/intel-xeon-processor-e5-2600-v4-product-family-technical-overview
  17. Nair, A., Srinivasan, P., Blackwell, S., Alcicek, C., Fearon, R., Maria, A.D., Panneershelvam, V., Suleyman, M., Beattie, C., Petersen, S., Legg, S., Mnih, V., Kavukcuoglu, K., Silver, D.: Massively parallel methods for deep reinforcement learning. CoRR abs/1507.04296 (2015). http://arxiv.org/abs/1507.04296
  18. Niu, F., Recht, B., Ré, C., Wright, S.J.: HOGWILD!: a lock-free approach to parallelizing stochastic gradient descent. arXiv e-prints, June 2011
  19. O’Connor, M.: Deep learning episode 4: supercomputer vs Pong II, October 2016. https://www.allinea.com/blog/201610/deep-learning-episode-4-supercomputer-vs-pong-ii
  20. Pirogov, V.: Introducing DNN primitives in Intel Math Kernel Library, March 2017. https://software.intel.com/en-us/articles/introducing-dnn-primitives-in-intelr-mkl
  21. Salimans, T., Ho, J., Chen, X., Sutskever, I.: Evolution strategies as a scalable alternative to reinforcement learning, March 2017. https://arxiv.org/abs/1703.03864
  22. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning. MIT Press, Cambridge (1998)
  23. Ould-Ahmed Vall, E.: Optimizing TensorFlow on Intel architecture for AI applications, March 2017. https://itpeernetwork.intel.com/tensorflow-intel-architecture-ai/
  24. Wu, Y.: Tensorpack (2016). https://github.com/ppwwyyxx/tensorpack
  25. You, Y., Zhang, Z., Hsieh, C.J., Demmel, J.: 100-epoch ImageNet training with AlexNet in 24 minutes. arXiv e-prints, September 2017

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Robert Adamski (1, 2, 3, 4)
  • Tomasz Grel (1, 2, 3, 4)
  • Maciej Klimek (1, 2, 3, 4)
  • Henryk Michalewski (1, 2, 3, 4)
  1. Intel, Gdansk, Poland
  2. deepsense.io, Warsaw, Poland
  3. University of Warsaw, Warsaw, Poland
  4. Institute of Mathematics of the Polish Academy of Sciences, Warsaw, Poland