A Literature Review on Hadoop Ecosystem and Various Techniques of Big Data Optimization

Singh, Vikash Kumar; Taram, Manish; Agrawal, Vinni; Baghel, Bhartee Singh

doi:10.1007/978-981-10-8360-0_22

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 38)

Abstract

We are living in the twenty-first century, a century defined by fast work, accurate analysis, heavily processed data, and speed. This is the epoch of “Big data.” Big data is a term describing a huge mass of structured and unstructured data that traditional data processing systems cannot process. Big data involves storing large amounts of data in order to extract valuable content, and it is characterized by the 5 Vs: Volume, Variety, Velocity, Veracity, and Value. Before the arrival of Hadoop, however, acquiring and storing data was a problem. Hadoop, created by Doug Cutting and Mike Cafarella, took its first step in the data science market in 2005. It is a software framework that allows users to store data and run their applications on Hadoop clusters. Its greatest strength is that it is open source.

Author information

Authors and Affiliations

IGNTU, Amarkantak, India
Vikash Kumar Singh, Manish Taram, Vinni Agrawal & Bhartee Singh Baghel

Corresponding author

Correspondence to Manish Taram.

Editor information

Editors and Affiliations

Smart Grid and Renewable Energy, University of Agder, Kristiansand, Norway
Mohan L. Kolhe
Department of Computer Science and Engineering, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
Munesh C. Trivedi
Department of Computer Science and Engineering, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
Shailesh Tiwari
Department of Computer Science and Engineering, The Indira Gandhi National Tribal University, Amarkantak, Madhya Pradesh, India
Vikash Kumar Singh

About this paper

Cite this paper

Singh, V.K., Taram, M., Agrawal, V., Baghel, B.S. (2018). A Literature Review on Hadoop Ecosystem and Various Techniques of Big Data Optimization. In: Kolhe, M., Trivedi, M., Tiwari, S., Singh, V. (eds) Advances in Data and Information Sciences. Lecture Notes in Networks and Systems, vol 38. Springer, Singapore. https://doi.org/10.1007/978-981-10-8360-0_22

DOI: https://doi.org/10.1007/978-981-10-8360-0_22
Published: 08 April 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8359-4
Online ISBN: 978-981-10-8360-0
eBook Packages: Engineering, Engineering (R0)
