Data-Centric Task Scheduling Algorithm for Hybrid Tasks in Cloud Data Centers

Li, Xin; Wang, Liangyuan; Abawajy, Jemal; Qin, Xiaolin

doi:10.1007/978-3-030-05054-2_47

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11335)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

With the development of big data, the demand for data analysis keeps increasing. This demand calls for a data-aware task scheduling approach that can efficiently schedule different kinds of tasks, such as batched tasks and real-time tasks, at the same time in a data center. To this end, we propose a hybrid task scheduling strategy coupled with data migration in data centers. First, we translate the task scheduling problem into a task selection problem and give methods for selecting batched tasks and real-time tasks, respectively. We then describe in detail the method for scheduling both batched and real-time tasks. Finally, we integrate data migration into the hybrid scheduling strategy. Experimental results show that, compared with the traditional FIFO algorithm, the proposed strategy greatly improves data locality, and that data migration performs very well in reducing job execution time. Our algorithm also guarantees acceptable fairness for tasks.
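The abstract does not give the selection rules themselves, so the following Python sketch only illustrates the general idea of treating scheduling as task selection when a node frees a slot. The `Task` fields, the function names, and the priority rule (real-time tasks by earliest deadline, then batched tasks with local data, then FIFO order) are all assumptions made for illustration and are not the authors' algorithm; data migration is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Task:
    name: str
    data_nodes: Set[str]              # nodes holding a replica of the input data
    deadline: Optional[float] = None  # assumption: set only for real-time tasks


def select_task(queue: List[Task], node: str) -> Optional[Task]:
    """Pick and remove the task to run on `node`, which has just freed a slot."""
    if not queue:
        return None
    # Hypothetical rule 1: real-time tasks go first, earliest deadline first.
    realtime = [t for t in queue if t.deadline is not None]
    if realtime:
        chosen = min(realtime, key=lambda t: t.deadline)
    else:
        # Hypothetical rule 2: prefer the oldest batched task whose data is local.
        local = [t for t in queue if node in t.data_nodes]
        # Fall back to plain FIFO order when no queued task has local data.
        chosen = local[0] if local else queue[0]
    queue.remove(chosen)
    return chosen


def locality_ratio(assignments: List[Tuple[Task, str]]) -> float:
    """Fraction of (task, node) assignments that read their input locally."""
    if not assignments:
        return 0.0
    hits = sum(1 for task, node in assignments if node in task.data_nodes)
    return hits / len(assignments)
```

Under these assumptions, a baseline FIFO scheduler would always take `queue[0]`, so comparing `locality_ratio` for the two policies on the same queue gives a rough feel for the data-locality metric mentioned in the abstract.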

Acknowledgment

This work is supported in part by the National Natural Science Foundation of China under Grant 61373015, in part by the Jiangsu Natural Science Foundation under Grants BK20160813 and BK20140832, in part by the National Key R&D Program of China under Grant 2018YFB1003902, in part by the Open Project Funded by State Key Laboratory of Computer Architecture under Grant CARCH201710, and in part by the Project Funded by China Postdoctoral Science Foundation.

Author information

Authors and Affiliations

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Xin Li, Liangyuan Wang & Xiaolin Qin
State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Xin Li
Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, China
Xin Li
School of Information Technology, Deakin University, Melbourne, Australia
Jemal Abawajy

Corresponding author

Correspondence to Xin Li.

Editor information

Editors and Affiliations

Rutgers University, Newark, NJ, USA
Jaideep Vaidya
Guangzhou University, Guangzhou, China
Jin Li

About this paper

Cite this paper

Li, X., Wang, L., Abawajy, J., Qin, X. (2018). Data-Centric Task Scheduling Algorithm for Hybrid Tasks in Cloud Data Centers. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11335. Springer, Cham. https://doi.org/10.1007/978-3-030-05054-2_47

DOI: https://doi.org/10.1007/978-3-030-05054-2_47
Published: 07 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05053-5
Online ISBN: 978-3-030-05054-2
eBook Packages: Computer Science, Computer Science (R0)
