ThunderML: A Toolkit for Enabling AI/ML Models on Cloud for Industry 4.0

  • Shrey Shrivastava
  • Dhaval Patel
  • Wesley M. Gifford
  • Stuart Siegel
  • Jayant Kalagnanam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11512)


AI, machine learning, and deep learning tools are now easily accessible on the cloud. However, heavy industries have been slow to adopt these cloud-based services because of the gap between general-purpose AI tools and the operational requirements of production industries. There are three fundamental gaps. The first is the lack of purpose-built solution pipelines designed for common industrial problem types. The second is the lack of tools for automating learning from noisy sensor data. The third is the lack of platforms that help practitioners leverage cloud-based environments for building and deploying custom modeling pipelines. In this paper, we present ThunderML, a toolkit that addresses these gaps by providing a powerful programming model that allows rapid authoring, training, and deployment of Industry 4.0 applications. Importantly, the system also facilitates cloud-based deployments by providing a vendor-agnostic pipeline execution and deployment layer.


Cognitive computing; IoT sensor data; Machine learning; Deep learning; Purpose-built AI pipelines



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Shrey Shrivastava (1)
  • Dhaval Patel (1)
  • Wesley M. Gifford (1)
  • Stuart Siegel (1)
  • Jayant Kalagnanam (1)

  1. IBM Research, Yorktown Heights, USA
