An Approach to Improve Load Balancing in Distributed Storage Systems for NoSQL Databases: MongoDB
The continuous generation of heterogeneous data calls for NoSQL database systems that can accommodate it. NoSQL databases store data in a distributed manner across globally deployed shards. The data stored in these databases should remain highly available, and the system should not compromise on scalability or partition tolerance. A main challenge for distributed storage systems is data skew, which arises from the way data items are distributed over the nodes of the system. To address this problem, we propose an approach to load balancing in the distributed environment that partitions data into small chunks that can be relocated independently.
Keywords: NoSQL; Data load balancing; MongoDB; Chunk migration; Big data
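The idea of balancing load by relocating small, independent chunks can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, nor MongoDB's balancer. It is a simple greedy heuristic under assumed conditions: each chunk has a known size, a shard's load is the sum of its chunk sizes, and the names `Shard`, `rebalance`, and `threshold` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Hypothetical shard model: a name plus a map of chunk id -> chunk size."""
    name: str
    chunks: dict = field(default_factory=dict)

    @property
    def load(self):
        # Assumption: load is simply the total size of the chunks held.
        return sum(self.chunks.values())


def rebalance(shards, threshold):
    """Greedily migrate chunks from the most loaded shard to the least
    loaded one until their load gap is at most `threshold`, or until no
    single chunk move can shrink the gap. Returns the migrations performed
    as (chunk_id, source_name, target_name) tuples."""
    migrations = []
    while True:
        source = max(shards, key=lambda s: s.load)
        target = min(shards, key=lambda s: s.load)
        gap = source.load - target.load
        if gap <= threshold:
            break
        # Moving a chunk of size c changes the gap to |gap - 2c|, which is
        # smaller than the current gap only when 0 < c < gap.
        movable = {cid: size for cid, size in source.chunks.items()
                   if 0 < size < gap}
        if not movable:
            break
        # Pick the chunk whose size is closest to half the gap, since that
        # move leaves the two shards as even as possible.
        cid = min(movable, key=lambda c: abs(movable[c] - gap / 2))
        target.chunks[cid] = source.chunks.pop(cid)
        migrations.append((cid, source.name, target.name))
    return migrations
```

Because every chunk is moved on its own, a skewed shard can shed load in small steps without repartitioning the whole data set. For example, calling `rebalance(shards, threshold=2)` on a list of `Shard` objects mutates them in place and returns the list of migrations that were applied.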