Abstract
Today, most real-time applications, such as bioinformatics and image processing, involve processing large amounts of unstructured data, which requires fast, highly efficient resources with large memory capacity. The introduction of the cloud has addressed this need, and the cloud is now the most favored option for big-data analytics. Hadoop, a framework for manipulating unstructured data, is used for this purpose. The nodes that form a Hadoop cluster are scheduled randomly in the Amazon cloud. Since huge amounts of data need to be transferred among these nodes, the time taken to upload and process the data is quite high, which degrades performance. Service providers additionally focus on maximizing resource utilization and minimizing power consumption. This chapter aims to design an energy-efficient scheduler for a cloud environment that is suitable for big-data applications. The scheduler has been tested in an OpenStack cloud environment.
Acknowledgments
We would like to express our sincere gratitude to Dr. R. Rudramoorthy, Principal, PSG College of Technology, for providing us with the necessary facilities for the work. We would also like to thank Dr. R. Venkatesan, HOD, Dept. of Computer Science and Engineering, for all the support provided to accomplish the work. The authors thank Mr. Chidambaran Kollengode, Director, Nokia R&D. This project is a PSG-Nokia Collaborative research work.
Copyright information
© 2015 Springer India
About this paper
Cite this paper
Raghuram, D., Gayathri, N., Sudha Sadasivam, G. (2015). Efficiently Scheduling Hadoop Cluster in Cloud Environment. In: Rajsingh, E., Bhojan, A., Peter, J. (eds) Informatics and Communication Technologies for Societal Development. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1916-3_10
DOI: https://doi.org/10.1007/978-81-322-1916-3_10
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1915-6
Online ISBN: 978-81-322-1916-3
eBook Packages: Engineering, Engineering (R0)