A Proposal for Optimization of Horizontal Scaling in Big Data Environment

  • Chandrima Roy
  • Manjusha Pandey
  • Siddharth Swarup Rautaray
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 38)


Big Data is data whose volume exceeds the storage capacity and the processing power of a single server. It cannot be managed with a traditional RDBMS or conventional statistical tools. Handling Big Data demands greater storage capacity as well as greater processing power. Horizontal scaling, or sharding, divides the data set and distributes it over multiple servers, which also provides redundancy and fault tolerance. Optimizing horizontal scaling is therefore an important aspect of Big Data technology. Instead of vertical scaling, which means upgrading to a more powerful machine when the current system becomes inadequate, horizontal scaling adds more nodes (computers) to a cluster. This increases parallelism rather than the performance of any single node. This paper presents the fundamentals of big data analytics, with a focus on analyzing the various optimization techniques used in the big data environment.
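The idea of dividing a data set across nodes and keeping redundant copies can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes simple hash-based placement with modulo arithmetic, and the names `shard_for`, `distribute`, and the `replicas` parameter are hypothetical, chosen only for illustration.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_nodes: int) -> int:
    """Map a record key to a node index using a stable hash (assumed policy)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def distribute(keys, num_nodes: int, replicas: int = 1) -> dict:
    """Place each key on `replicas` consecutive nodes, starting at its primary shard.

    Hypothetical placement rule: copies go to the next nodes in index order,
    so `replicas` must not exceed `num_nodes`.
    """
    if replicas > num_nodes:
        raise ValueError("replicas must not exceed num_nodes")
    placement = defaultdict(list)
    for key in keys:
        primary = shard_for(key, num_nodes)
        for offset in range(replicas):
            placement[(primary + offset) % num_nodes].append(key)
    return dict(placement)


if __name__ == "__main__":
    records = [f"user{i}" for i in range(12)]
    for node, held in sorted(distribute(records, num_nodes=4, replicas=2).items()):
        print(node, held)
```

With two copies of every key, the loss of any single node leaves each record available on another node, which is the redundancy the abstract refers to. Modulo placement relocates most keys whenever the number of nodes changes; consistent hashing is a common alternative when nodes are added frequently.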


Optimization, Horizontal scaling, RDBMS, Hadoop, Big data tools



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Chandrima Roy (1)
  • Manjusha Pandey (1)
  • Siddharth Swarup Rautaray (1)
  1. School of Computer Engineering, KIIT University, Bhubaneswar, India
