Efficiently Scheduling Hadoop Cluster in Cloud Environment

  • Conference paper
Informatics and Communication Technologies for Societal Development

Abstract

Today, most real-time applications, such as bioinformatics and image processing, involve processing large amounts of unstructured data, which requires fast, highly efficient resources with ample memory. Cloud computing addresses this need and is now the most favored option for big-data analytics. Hadoop, a framework for manipulating unstructured data, is used for this purpose. The nodes that form a Hadoop cluster are scheduled randomly in the Amazon cloud. Because huge amounts of data must be transferred among these nodes, the time taken to upload and process the data is quite high, which degrades performance. Service providers, in addition, focus on maximizing resource utilization and minimizing power consumption. This chapter aims to design an energy-efficient scheduler for a cloud environment that is suitable for big-data applications. The scheduler has been tested in an OpenStack cloud environment.

Acknowledgments

We would like to express our sincere gratitude to Dr. R. Rudramoorthy, Principal, PSG College of Technology, for providing us with the necessary facilities for this work. We would also like to thank Dr. R. Venkatesan, HOD, Dept. of Computer Science and Engineering, for all the support provided to accomplish the work. The authors thank Mr. Chidambaran Kollengode, Director, Nokia R&D. This project is a PSG-Nokia collaborative research effort.

Corresponding author

Correspondence to N. Gayathri.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Raghuram, D., Gayathri, N., Sudha Sadasivam, G. (2015). Efficiently Scheduling Hadoop Cluster in Cloud Environment. In: Rajsingh, E., Bhojan, A., Peter, J. (eds) Informatics and Communication Technologies for Societal Development. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1916-3_10

  • DOI: https://doi.org/10.1007/978-81-322-1916-3_10

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-1915-6

  • Online ISBN: 978-81-322-1916-3

  • eBook Packages: Engineering, Engineering (R0)
