A Literature Review on Hadoop Ecosystem and Various Techniques of Big Data Optimization

  • Conference paper
Advances in Data and Information Sciences

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 38)

Abstract

We are living in the twenty-first century, a century defined by fast work, accurate analysis, highly processed data, and speed. This is the epoch of “Big data.” Big data is a term that describes a huge mass of structured and unstructured data that cannot be processed by traditional data processing systems. It refers to the storage of large amounts of data in order to extract valuable content, and it is characterized by the 5 Vs: Volume, Variety, Velocity, Veracity, and Value. Before the arrival of Hadoop, however, acquiring and storing such data was a problem. Hadoop took its first step in the data science market in 2005. It was created by Doug Cutting and Mike Cafarella. Hadoop is a software framework that allows users to store data and run their applications on Hadoop clusters. Its best feature is that it is open source.

Author information

Corresponding author

Correspondence to Manish Taram.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Singh, V.K., Taram, M., Agrawal, V., Baghel, B.S. (2018). A Literature Review on Hadoop Ecosystem and Various Techniques of Big Data Optimization. In: Kolhe, M., Trivedi, M., Tiwari, S., Singh, V. (eds) Advances in Data and Information Sciences. Lecture Notes in Networks and Systems, vol 38. Springer, Singapore. https://doi.org/10.1007/978-981-10-8360-0_22

  • DOI: https://doi.org/10.1007/978-981-10-8360-0_22

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8359-4

  • Online ISBN: 978-981-10-8360-0

  • eBook Packages: Engineering, Engineering (R0)
