
Forest of Distributed B+Tree Based on Key-Value Store for Big-Set Problem

  • Thanh Trung Nguyen
  • Minh Hieu Nguyen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9645)

Abstract

In many big-data systems, the amount of data grows rapidly. Many systems have to store big-sets: sets with a very large number of items. Efficiently storing a large number of big-sets while supporting high rates of updates and queries is a challenging problem in data storage systems. Distributed key-value stores now play an important role in building large-scale systems. They offer horizontal scalability, low latency, and high throughput when manipulating small or medium-sized key-value pairs. Unfortunately, they do not work well with big-set data structures, and most of them do not scale to a large number of big-sets. In this research, we analyze the difficulty of storing big-sets in key-value stores. We propose an architecture called “Forest of distributed \(B^{+}Tree\)”, together with algorithms, to build a NoSQL data store for big data structures such as sets and dictionaries. Big-sets are split into multiple small sets of limited size, which are stored in key-value stores. Multi-level metadata is also proposed and used to reduce the complexity of write operations on big-sets in key-value stores from O(N) to O(log(N)). The proposed system can store a larger number of items in a set than Cassandra and Google BigTable. The parts of a big-set are distributed, whereas a row in Google BigTable has a limited size and must fit on a single server. Experimental results show that the proposed system has better read performance than Cassandra. The proposed architecture could potentially be used in applications such as storage systems for sensor data in Internet of Things (IoT) systems, commercial transaction storage, and social networks.
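
The splitting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ChunkedBigSet`, the `max_chunk` limit, the `chunk:N` key format, and the use of a plain `dict` as the key-value store are all assumptions made for illustration, and the metadata here is a single flat sorted index rather than the multi-level metadata the paper proposes. The point it shows is that an insert rewrites only one bounded-size chunk instead of the whole set value.

```python
import bisect


class ChunkedBigSet:
    """Toy big-set: items live in bounded-size sorted chunks in a key-value store."""

    def __init__(self, kv, max_chunk=4):
        self.kv = kv                # any dict-like key-value store (assumption)
        self.max_chunk = max_chunk  # hypothetical size limit of one small set
        self.min_keys = []          # metadata: smallest item of each chunk, sorted
        self.chunk_ids = []         # metadata: store key of each chunk, same order
        self.next_id = 0

    def _new_chunk(self, items):
        cid = f"chunk:{self.next_id}"  # hypothetical key naming scheme
        self.next_id += 1
        self.kv[cid] = items
        return cid

    def _locate(self, item):
        # Index of the chunk whose key range should hold `item`.
        return max(bisect.bisect_right(self.min_keys, item) - 1, 0)

    def add(self, item):
        """Insert `item`; return False if it was already present."""
        if not self.chunk_ids:
            self.min_keys.append(item)
            self.chunk_ids.append(self._new_chunk([item]))
            return True
        i = self._locate(item)
        cid = self.chunk_ids[i]
        chunk = self.kv[cid]
        pos = bisect.bisect_left(chunk, item)
        if pos < len(chunk) and chunk[pos] == item:
            return False
        chunk.insert(pos, item)
        self.min_keys[i] = chunk[0]
        if len(chunk) > self.max_chunk:
            # Split the overfull chunk in half and register the new right half.
            mid = len(chunk) // 2
            left, right = chunk[:mid], chunk[mid:]
            self.kv[cid] = left
            self.min_keys.insert(i + 1, right[0])
            self.chunk_ids.insert(i + 1, self._new_chunk(right))
        else:
            self.kv[cid] = chunk  # write back only this one small chunk
        return True

    def __contains__(self, item):
        if not self.chunk_ids:
            return False
        chunk = self.kv[self.chunk_ids[self._locate(item)]]
        pos = bisect.bisect_left(chunk, item)
        return pos < len(chunk) and chunk[pos] == item

    def items(self):
        """Yield all items in sorted order by walking the chunks."""
        for cid in self.chunk_ids:
            yield from self.kv[cid]
```

In this sketch the flat index is itself a list, so registering a new chunk still costs time proportional to the number of chunks. Organizing that index as a tree of bounded-size nodes, in the spirit of the multi-level metadata described above, is what would bring the write path down to logarithmic cost.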

Keywords

Big set · Forest of distributed B+Tree · Key-value · Big data structure · Storage

Notes

Acknowledgment

This research is funded by Research and Development Department of VNG.


Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Research and Development Department, VNG Corporation, Hanoi, Vietnam
  2. Le Quy Don Technical University, Hanoi, Vietnam
