Atari Games and Intel Processors

  • Robert Adamski
  • Tomasz Grel
  • Maciej Klimek
  • Henryk Michalewski
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 818)


The asynchronous nature of state-of-the-art reinforcement learning algorithms such as the Asynchronous Advantage Actor-Critic (A3C) algorithm makes them exceptionally suitable for CPU computations. However, because deep reinforcement learning often involves interpreting visual information, a large part of the training and inference time is spent performing convolutions.
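
As a rough illustration of the cost claim above (not part of the paper itself), the multiply-accumulate (MAC) counts below assume the three-layer convolutional stack commonly used for Atari agents (84×84×4 input; 32 8×8/stride-4, 64 4×4/stride-2 and 64 3×3/stride-1 filters, followed by a 512-unit dense layer); these layer sizes are an assumption taken from the wider literature, not from this text.

```python
# Back-of-the-envelope MAC counts for a typical Atari network
# (architecture assumed, not taken from this abstract).

def conv_out(size, kernel, stride):
    # Output width of a "valid" convolution: floor((size - kernel) / stride) + 1
    return (size - kernel) // stride + 1

def conv_macs(in_hw, in_ch, kernel, stride, out_ch):
    # One multiply-accumulate per (output pixel, output channel,
    # kernel element, input channel).
    out_hw = conv_out(in_hw, kernel, stride)
    return out_hw, out_hw * out_hw * out_ch * kernel * kernel * in_ch

hw, c1 = conv_macs(84, 4, 8, 4, 32)   # conv1: 84x84x4 -> 20x20x32
hw, c2 = conv_macs(hw, 32, 4, 2, 64)  # conv2: 20x20x32 -> 9x9x64
hw, c3 = conv_macs(hw, 64, 3, 1, 64)  # conv3: 9x9x64 -> 7x7x64
fc = hw * hw * 64 * 512               # dense layer: 3136 -> 512

conv_total = c1 + c2 + c3
print(conv_total, fc)                  # 7737344 1605632
print(conv_total / (conv_total + fc))  # convolutions dominate: ~0.83
```

Under these assumptions the convolutional layers account for roughly 80% of the per-forward-pass arithmetic, which is why optimized convolution kernels matter so much for CPU training.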

In this work we present our results on learning strategies in Atari games using a Convolutional Neural Network, the Intel Math Kernel Library and the TensorFlow framework. We also analyze the effects of asynchronous computations on the convergence of reinforcement learning algorithms.
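
A minimal sketch of the kind of asynchrony the abstract refers to, on a toy problem rather than the paper's actual experiments: several CPU threads apply lock-free, Hogwild-style gradient updates to shared parameters (here minimizing a hypothetical quadratic ||w − target||², standing in for an actor-critic loss), and the stale, interleaved updates still converge.

```python
import threading
import numpy as np

# Toy stand-in for shared model parameters; a real A3C setup would
# share network weights and compute policy/value gradients instead.
target = np.array([1.0, -2.0, 0.5])
w = np.zeros(3)

def worker(steps, lr):
    for _ in range(steps):
        grad = 2.0 * (w - target)  # gradient of ||w - target||^2; may be stale
        w[:] = w - lr * grad       # lock-free write: updates can interleave

threads = [threading.Thread(target=worker, args=(2000, 0.01)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(np.allclose(w, target, atol=1e-3))  # expected True: asynchrony does not break convergence
```

Each update is a contraction toward the optimum, so occasional stale reads only slow progress rather than prevent it; this is the intuition behind lock-free parallel SGD schemes such as HOGWILD!.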



This research was supported in part by PL-Grid Infrastructure, grant identifier openaigym.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Robert Adamski (1, 2, 3, 4)
  • Tomasz Grel (1, 2, 3, 4)
  • Maciej Klimek (1, 2, 3, 4)
  • Henryk Michalewski (1, 2, 3, 4)
  1. Intel, Gdansk, Poland
  2. deepsense.io, Warsaw, Poland
  3. University of Warsaw, Warsaw, Poland
  4. Institute of Mathematics of the Polish Academy of Sciences, Warsaw, Poland
