Horizontal Scaling Enhancement for Optimized Big Data Processing

  • Chandrima Roy
  • Kashyap Barua
  • Sandeep Agarwal
  • Manjusha Pandey
  • Siddharth Swarup Rautaray
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 755)


Big Data is becoming a major technological trend in industry, science, and business. Near-unlimited data scalability allows organizations to process huge amounts of data in parallel, which dramatically decreases the time needed to handle a given workload, optimizes hardware resource usage, and allows the maximum quantity of data per node to be handled. Optimization is done to attain the best strategy relative to a set of selected constraints, which include maximizing factors such as efficiency, productivity, reliability, strength, and utilization. With horizontal scaling, when the current system becomes insufficient, more computers are added to the cluster instead of upgrading the existing machine with more components. This research discusses a hierarchical architecture of Hadoop nodes, namely Name Nodes and Data Nodes, and focuses mainly on optimizing the Data Node by distributing some of its workload to the Name Node.
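The horizontal-scaling idea in the abstract can be illustrated with a minimal sketch. It assumes a toy round-robin placement policy, and the names `assign_blocks`, `max_load`, and the node labels are hypothetical: they come neither from the paper nor from Hadoop's API. Real HDFS placement is replicated and rack-aware, so this only shows how adding nodes lowers the per-node load.

```python
from collections import Counter


def assign_blocks(num_blocks: int, data_nodes: list[str]) -> dict[int, str]:
    """Map each block id to a data node in round-robin order (toy policy, an assumption)."""
    if not data_nodes:
        raise ValueError("at least one data node is required")
    return {block: data_nodes[block % len(data_nodes)] for block in range(num_blocks)}


def max_load(placement: dict[int, str]) -> int:
    """Return the largest number of blocks held by any single node."""
    return max(Counter(placement.values()).values(), default=0)


# The same 12 blocks placed on a 2-node cluster and on a 4-node cluster.
small_cluster = assign_blocks(12, ["dn1", "dn2"])
large_cluster = assign_blocks(12, ["dn1", "dn2", "dn3", "dn4"])

print(max_load(small_cluster))  # 6 blocks on the busiest node
print(max_load(large_cluster))  # 3 blocks on the busiest node
```

Doubling the node count halves the busiest node's share here because the blocks divide evenly; with uneven block sizes or replication the gain would be less exact.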


Big data · Hadoop · Optimization · Scalability · Horizontal scaling


Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Chandrima Roy (1)
  • Kashyap Barua (1)
  • Sandeep Agarwal (1)
  • Manjusha Pandey (1)
  • Siddharth Swarup Rautaray (1)

  1. School of Computer Engineering, KIIT, Deemed to be University, Bhubaneswar, India
