Machine learning-based auto-scaling for containerized applications


Containers are shaping the new era of cloud applications due to their key benefits such as lightweight, very quick to launch, consuming minimum resources to run an application which reduces cost, and can be easily and rapidly scaled up/down as per workload requirements. However, container-based cloud applications require sophisticated auto-scaling methods that automatically and in a timely manner provision and de-provision cloud resources without human intervention in response to dynamic fluctuations in workload. To address this challenge, in this paper, we propose a proactive machine learning-based approach to perform auto-scaling of Docker containers in response to dynamic workload changes at runtime. The proposed auto-scaler architecture follows the commonly abstracted four steps: monitor, analyze, plan, and execute the control loop. The monitor component continuously collects different types of data (HTTP request statistics, CPU, and memory utilization) that are needed during the analysis and planning phase to determine proper scaling actions. We employ in analysis phase a concise yet fast, adaptive, and accurate prediction model based on long short-term memory (LSTM) neural network to predict future HTTP workload to determine the number of containers needed to handle requests ahead of time to eliminate delays caused by starting or stopping running containers. Moreover, in the planning phase, the proposed gradually decreasing strategy avoids oscillations which happens when scaling operations are too frequent. Experimental results using realistic workload show that the prediction accuracy of LSTM model is as accurate as auto-regression integrated moving average model but offers 600 times prediction speedup. Moreover, as compared with artificial neural network model, LSTM model performs better in terms of auto-scaler metrics related to provisioning and elastic speedup. In addition, it was observed that when LSTM model is used, the predicted workload helped in using the minimum number of replicas to handle future workload. In the experiments, the use of GDS showed promising results in keeping the desired performance at reduced cost to handle cases with sudden workload increase/decrease.

This is a preview of subscription content, log in to check access.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17


  1. 1.

    Varghese B, Buyya R (2018) Next generation cloud computing: new trends and research directions. Future Gener Comput Syst 79:849–861

    Article  Google Scholar 

  2. 2.

    Alouane M, El Bakkali H (2016) Virtualization in cloud computing: existing solutions and new approach. In: 2016 2nd international conference on cloud computing technologies and applications (CloudTech). IEEE, pp 116–123

  3. 3.

    Pahl C, Brogi A, Soldani J, Jamshidi P (2017) Cloud container technologies: a state-of-the-art review. IEEE Trans Cloud Comput

  4. 4.

    Gupta V, Kaur K, Kaur S (2017) Performance comparison between light weight virtualization using docker and heavy weight virtualization, vol 2, pp 211–216

  5. 5.

    Bernstein D (2014) Containers and cloud: from lxc to docker to kubernetes. IEEE Cloud Comput 1(3):81–84

    Article  Google Scholar 

  6. 6.

    Burns B, Grant B, Oppenheimer D, Brewer E, Wilkes J (2016) Borg, omega, and kubernetes. ACM Queue 14:70–93

    Article  Google Scholar 

  7. 7.

    Jamshidi P, Pahl C, Mendonça NC, Lewis J, Tilkov S (2018) Microservices: the journey so far and challenges ahead. IEEE Softw 35:24–35

    Article  Google Scholar 

  8. 8.

    Soldani J, Tamburri DA, Heuvel W-JVD (2018) The pains and gains of microservices: a systematic grey literature review. J Syst Softw 146:215–232

    Article  Google Scholar 

  9. 9.

    Khazaei H, Bannazadeh H, Leon-Garcia A (2017) Savi-iot: self-managing containerized iot platform. In: 2017 IEEE 5th international conference on future Internet of Things and Cloud (FiCloud), pp 227–234

  10. 10.

    Morabito R, Farris I, Iera A, Taleb T (2017) Evaluating performance of containerized iot services for clustered devices at the network edge. IEEE Internet Things J 4:1019–1030

    Article  Google Scholar 

  11. 11.

    Morabito R, Petrolo R, Loscrì V, Mitton N, Ruggeri G, Molinaro A (2017) Lightweight virtualization as enabling technology for future smart cars. In: 2017 IFIP/IEEE symposium on integrated network and service management (IM), pp 1238–1245

  12. 12.

    Buyya R, Srirama SN, Casale G, Calheiros R, Simmhan Y, Varghese B, Gelenbe E, Javadi B, Vaquero LM, Netto MAS, Toosi AN, Rodriguez MA, Llorente IM, Vimercati SDCD, Samarati P, Milojicic D, Varela C, Bahsoon R, Assuncao MDD, Rana O, Zhou W, Jin H, Gentzsch W, Zomaya AY, Shen H (2018) A manifesto for future generation cloud computing: research directions for the next decade. ACM Comput Surv 51:105:1–105:38

    Google Scholar 

  13. 13.

    Al-Dhuraibi Y, Paraiso F, Djarallah N, Merle P (2018) Elasticity in cloud computing: state of the art and research challenges. IEEE Trans Serv Comput 11:430–447

    Article  Google Scholar 

  14. 14.

    Lorido-Botran T, Miguel-Alonso J, Lozano JA (2014) A review of auto-scaling techniques for elastic applications in cloud environments. J Grid Comput 12:559–592

    Article  Google Scholar 

  15. 15.

    Aslanpour MS, Ghobaei-Arani M, Toosi AN (2017) Auto-scaling web applications in clouds: a cost-aware approach. J Netw Comput Appl 95:26–41

    Article  Google Scholar 

  16. 16.

    Huebscher MC, McCann JA (2008) A survey of autonomic computing-degrees, models, and applications. ACM Comput Surv 40:7:1–7:28

    Article  Google Scholar 

  17. 17.

    Qu C, Calheiros RN, Buyya R (2018) Auto-scaling web applications in clouds: a taxonomy and survey. ACM Comput Surv 51:73:1–73:33

    Article  Google Scholar 

  18. 18.

    Cardenas YMR (2018) Scaling policies derivation for predictive autoscaling of cloud applications. Master’s thesis, University of Munich

  19. 19.

    Klinaku F, Frank M, Becker S (2018) Caus: an elasticity controller for a containerized microservice. In: Companion of the 2018 ACM/SPEC international conference on performance engineering, ICPE ’18, New York. ACM, pp 93–98

  20. 20.

    Al-Dhuraibi Y, Paraiso F, Djarallah N, Merle P (2017) Autonomic vertical elasticity of docker containers with elasticdocker. In: 2017 IEEE 10th international conference on cloud computing (CLOUD), pp 472–479

  21. 21.

    Taherizadeh S, Stankovski V (2018) Dynamic multi-level auto-scaling rules for containerized applications. Comput J 62:174–197

    Article  Google Scholar 

  22. 22.

    Zhang F, Tang X, Li X, Khan SU, Li Z (2019) Quantifying cloud elasticity with container-based autoscaling. Future Gener Comput Syst 98:672–681

    Article  Google Scholar 

  23. 23.

    Kan C (2016) Docloud: an elastic cloud platform for web applications based on docker. In: 2016 18th international conference on advanced communication technology (ICACT), p 1

  24. 24.

    Li Y, Xia Y (2016) Auto-scaling web applications in hybrid cloud based on docker. In: 2016 5th International conference on computer science and network technology (ICCSNT), pp 75–79

  25. 25.

    Kubernetes horizontal pod auto-scaling.’ Accessed 19 April 2019

  26. 26.

    Ciptaningtyas HT, Santoso BJ, Razi MF (2017) Resource elasticity controller for docker-based web applications. In: 11th international conference on information & communication technology and system (ICTS), pp 193–196

  27. 27.

    Meng Y, Rao R, Zhang X, Hong P (2016) Crupa: a container resource utilization prediction algorithm for auto-scaling based on time series analysis. In: 2016 International conference on progress in informatics and computing (PIC), pp 468–472

  28. 28.

    Kim W-Y, Lee J-S, Huh E-N (2017) Study on proactive auto scaling for instance through the prediction of network traffic on the container environment. In: Proceedings of the 11th international conference on ubiquitous information management and communication, IMCOM ’17, New York, NY, USA. ACM, pp 17:1–17:8

  29. 29.

    Borkowski M, Schulte S, Hochreiner C (2016) Predicting cloud resource utilization. In: 2016 IEEE/ACM 9th international conference on utility and cloud computing (UCC), pp 37–42

  30. 30.

    Sangpetch A, Sangpetch O, Juangmarisakul N, Warodom S (2017) Thoth: automatic resource management with machine learning for container-based cloud platform, pp 103–111

  31. 31.

    Pouyanfar S, Sadiq S, Yan Y, Tian H, Tao Y, Reyes MP, Shyu M-L, Chen S-C, Iyengar S (2018) A survey on deep learning: algorithms, techniques, and applications. ACM Comput Surv (CSUR) 51(5):92

    Google Scholar 

  32. 32.

    Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

    Article  Google Scholar 

  33. 33.

    Pascanu R, Mikolov T, Bengio Y (2013) On the difficulty of training recurrent neural networks. In: Proceedings of the 30th international conference on international conference on machine learning, vol 28, ICML’13,, pp III-1310–III-1318

  34. 34.

    Ye T, Guangtao X, Shiyou Q, Minglu L (2017) An auto-scaling framework for containerized elastic applications. In: 2017 3rd international conference on big data computing and communications (BIGCOM), pp 422–430

  35. 35.

    Box GEP, Jenkins GM, Reinsel GC, Ljung GM (2015) Time series analysis: forecasting and control, 5th edn. Wiley, Hoboken

    Google Scholar 

  36. 36.

    Baresi L, Guinea S, Leva A, Quattrocchi G (2016) A discrete-time feedback controller for containerized cloud applications. In: Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering, FSE 2016, New York, NY, USA. ACM, pp 217–228

  37. 37.

    Wu S, Zhang D, Yan B, Guo F, Sheng D (2018) Auto-scaling web application in docker based on gray prediction. In: 2018 International conference on network, communication, computer engineering (NCCE 2018). Atlantis Press, 2018/05, pp 169–174

  38. 38.

    Chiang JS, Wu PL, Chiang SD, Chang TJ, Chang ST, Wen KL (1998) Introduction of grey system theory. GAO-Li Publication, Taiwan

    Google Scholar 

  39. 39.

    Watkins CJCH, Dayan P (1992) Q-learning. In: Machine learning, pp 279–292

  40. 40.

    Truyen E, Van Landuyt D, Preuveneers D, Lagaisse B, Joosen W (2019) A comprehensive feature comparison study of open-source container orchestration frameworks. Appl Sci 9(5)

  41. 41.

    Haproxy—the reliable, high-performance tcp/http load balancer. Accessed 16 April 2019

  42. 42.

    Arlitt M, Jin T (2000) A workload characterization study of the 1998 world cup web site. IEEE Netw 14(3):30–37

    Article  Google Scholar 

  43. 43.

    Bauer A, Herbst N, Spinner S, Ali-Eldin A, Kounev S (2018) Chameleon: a hybrid, proactive auto-scaling mechanism on a level-playing field. IEEE Trans Parallel Distrib Syst 30(4):800–813

    Article  Google Scholar 

  44. 44.

    Messias VR, Estrella JC, Ehlers R, Santana MJ, Santana RC, Reiff-Marganiec S (2016) Combining time series prediction models using genetic algorithm to autoscaling web applications hosted in the cloud infrastructure. Neural Comput Appl 27(8):2383–2406

    Article  Google Scholar 

  45. 45.

    Roy N, Dubey A, Gokhale A (2011) Efficient autoscaling in the cloud using predictive models for workload forecasting. In: 2011 IEEE 4th international conference on cloud computing. IEEE, pp 500–507

  46. 46.

    Moore LR, Bean K, Ellahi T (2013) Transforming reactive auto-scaling into proactive auto-scaling. In: Proceedings of the 3rd international workshop on cloud data and platforms. ACM, pp 7–12

  47. 47.

    Herbst N, Krebs R, Oikonomou G, Kousiouris G, Evangelinou A, Iosup A, Kounev S (2016) Ready for rain? A view from spec research on the future of cloud metrics. arXiv:1604.03470

  48. 48.

    Bauer A, Grohmann J, Herbst N, Kounev S (2018) On the value of service demand estimation for auto-scaling. In: International conference on measurement, modelling and evaluation of computing systems. Springer, New York, pp 142–156

Download references


The authors would like to thank the anonymous reviewers for their invaluable comments that definitely improved the overall quality of the manuscript.

Author information



Corresponding author

Correspondence to Mohammad Gh. Alfailakawi.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Imdoukh, M., Ahmad, I. & Alfailakawi, M.G. Machine learning-based auto-scaling for containerized applications. Neural Comput & Applic 32, 9745–9760 (2020).

Download citation


  • Containerization
  • Auto-scaling
  • Proactive controller
  • Prediction
  • Neural network
  • Long short-term memory