Load Balance Based Job Scheduling in Geo-Distributed Clouds
With the rapid development of the Internet, more and more people can access it from anywhere in the world, so data are becoming geographically distributed. A single cloud cannot process such data efficiently because of high delay and transmission cost. Geo-distributed clouds can alleviate this problem. However, because these clouds sit in different locations, balancing their workloads is a crucial problem. In this paper, we propose an efficient load-balance-based job scheduling algorithm for geo-distributed clouds that aims to minimize the average waiting time and average response time of jobs and to improve system throughput. First, the clouds are divided into idle and busy states so that the job execution time in each cloud can be obtained by logistic regression. Then, the job scheduling problem in each cloud is modeled as an \( M/M/C \) queue. Next, the Lagrange multiplier method is used to derive the optimal job arrival rate of each cloud. Finally, experimental results show that the proposed algorithm decreases the average waiting time, average execution time, and average response time of jobs, and improves system throughput.
Keywords: Geo-distributed clouds; Queuing theory; Job scheduling; Logistic regression
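To make the \( M/M/C \) model mentioned in the abstract concrete, the following minimal sketch computes the standard Erlang C waiting probability and the resulting mean waiting and response times for a single cloud. It is textbook queueing theory, not the paper's algorithm: the function names, the parameter names (`arrival_rate`, `service_rate`, `servers`), and the numbers in the usage lines are illustrative assumptions, and the paper's Lagrange multiplier step is not reproduced here.

```python
from math import factorial


def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving job has to wait in an M/M/c queue."""
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    utilization = offered_load / servers         # rho = a / c
    if utilization >= 1:
        raise ValueError("unstable queue: need arrival_rate < servers * service_rate")
    # Term for the states in which all servers are busy.
    tail = offered_load ** servers / (factorial(servers) * (1 - utilization))
    # Terms for the states in which at least one server is free.
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    return tail / (head + tail)


def mean_waiting_time(arrival_rate, service_rate, servers):
    """Mean time a job spends queued before service starts (Wq)."""
    wait_probability = erlang_c(arrival_rate, service_rate, servers)
    return wait_probability / (servers * service_rate - arrival_rate)


def mean_response_time(arrival_rate, service_rate, servers):
    """Mean waiting time plus mean service time (W = Wq + 1/mu)."""
    return mean_waiting_time(arrival_rate, service_rate, servers) + 1 / service_rate


# Hypothetical cloud: 2 servers, each serving 1 job per time unit,
# with jobs arriving at 1 job per time unit.
wq = mean_waiting_time(arrival_rate=1.0, service_rate=1.0, servers=2)
w = mean_response_time(arrival_rate=1.0, service_rate=1.0, servers=2)
```

Under these assumptions, a scheduler could evaluate `mean_response_time` for each cloud at a candidate arrival rate and compare the results; how the paper actually splits the arrival rates across clouds is not shown in this sketch.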
The work was supported by the National Natural Science Foundation (NSF) under grants (No. 61672397, No. 61873341), the Application Foundation Frontier Project of Wuhan (No. 2018010401011290), the Opening Project of State Key Laboratory of Digital Publishing Technology, and the Opening Project of Jiangsu Key Laboratory of Meteorological Observation and Information Processing. Any opinions, findings, and conclusions are those of the authors and do not necessarily reflect the views of the above agencies.