Efficiently Scheduling Hadoop Cluster in Cloud Environment

Raghuram, D.; Gayathri, N.; Sudha Sadasivam, G.

doi:10.1007/978-81-322-1916-3_10

Abstract

Today, most real-time applications, such as bioinformatics and image processing, process large amounts of unstructured data and therefore require fast, memory-intensive, and highly efficient resources. The cloud addresses this need and is now the most favored option for big-data analytics. Hadoop, a framework for manipulating unstructured data, is used for this purpose. The nodes that form a Hadoop cluster are scheduled randomly in the Amazon cloud. Because huge amounts of data must be transferred among these nodes, the time taken to upload and process the data is high, which degrades performance. Service providers also focus on maximizing resource utilization and minimizing power consumption. This chapter aims to design an energy-efficient scheduler for a cloud environment that is suitable for big-data applications. The scheduler has been tested in an OpenStack cloud environment.

Acknowledgments

We would like to express our sincere gratitude to Dr. R. Rudramoorthy, Principal, PSG College of Technology, for providing us with the necessary facilities for the work. We would also like to thank Dr. R. Venkatesan, HOD, Dept. of Computer Science and Engineering, for all the support provided to accomplish the work. The authors thank Mr. Chidambaran Kollengode, Director, Nokia R&D. This project is a PSG-Nokia Collaborative research work.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, PSG College of Technology, Coimbatore, TN, India
D. Raghuram, N. Gayathri & G. Sudha Sadasivam

Corresponding author

Correspondence to N. Gayathri.

Editor information

Editors and Affiliations

School of Computer Science and Technology, Karunya University, Coimbatore, Tamil Nadu, India
Elijah Blessing Rajsingh
School of Computing, National University of Singapore, Singapore, Singapore
Anand Bhojan
Department of Information Technology, Karunya University, Coimbatore, Tamil Nadu, India
J. Dinesh Peter

About this paper

Cite this paper

Raghuram, D., Gayathri, N., Sudha Sadasivam, G. (2015). Efficiently Scheduling Hadoop Cluster in Cloud Environment. In: Rajsingh, E., Bhojan, A., Peter, J. (eds) Informatics and Communication Technologies for Societal Development. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1916-3_10

DOI: https://doi.org/10.1007/978-81-322-1916-3_10
Published: 04 July 2014
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1915-6
Online ISBN: 978-81-322-1916-3
eBook Packages: Engineering, Engineering (R0)
