Cluster Computing, Volume 22, Supplement 1, pp 1011–1021

B+-tree construction on massive data with Hadoop

  • Huynh Cong Viet Ngu
  • Jun-Ho Huh


Data processing in the Socialist Republic of Vietnam (hereafter, Vietnam) is at an early stage, and a variety of problems remain to be solved. In the Vietnamese banking and financial sectors, where the management and storage of customer data and transaction histories are being emphasized as never before, the volume of data to be secured each day is growing explosively because of rapid economic development, and the relevant authorities are seeking an efficient and reliable way to manage it. The B+-tree, a widely used variant of the B-tree, is considered one of the most suitable tree-type data structures for bulk data. However, constructing a B+-tree over massive data is quite time-consuming, so the authors propose a parallel B+-tree construction system based on the Hadoop framework. The system works in three phases. First, the data are partitioned and distributed so that each partition holds almost the same volume of data. Second, local B+-trees are constructed in parallel over the partitions. Finally, these small-scale B+-trees are integrated into a complete B+-tree that covers the entire data set. The authors expect the proposed system to provide efficient index construction while reducing data processing time.
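
The three phases can be illustrated with a minimal single-machine sketch in Python. It is not the authors' Hadoop implementation: the abstract does not say how partitions are chosen or how the local trees are integrated, so everything below is an assumption made for illustration. A thread pool stands in for the parallel Hadoop tasks, the keys are sorted globally before being split (a real job would more likely pick split points by sampling), local trees are bulk-loaded bottom-up, and shorter local trees are padded with single-child nodes so that all leaves end up at the same depth. The names `ORDER`, `Node`, `partition`, `build_local`, `merge`, and `build_bplus_tree` are hypothetical, and minimum node occupancy is not enforced.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import List, Optional

ORDER = 4  # assumed fan-out: max keys per leaf and max children per internal node


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf
    next_leaf: Optional["Node"] = None  # sibling link, used by leaves only


def edge_leaf(node: Node, side: int) -> Node:
    """Walk to the leftmost (side=0) or rightmost (side=-1) leaf."""
    while node.children:
        node = node.children[side]
    return node


def height(node: Node) -> int:
    levels = 0
    while node.children:
        node, levels = node.children[0], levels + 1
    return levels


def build_levels(nodes: List[Node]) -> Node:
    """Stack internal levels over `nodes` until a single root remains."""
    while len(nodes) > 1:
        parents = []
        for i in range(0, len(nodes), ORDER):
            group = nodes[i:i + ORDER]
            # Separator keys: the smallest key under each child except the first.
            seps = [edge_leaf(child, 0).keys[0] for child in group[1:]]
            parents.append(Node(keys=seps, children=group))
        nodes = parents
    return nodes[0]


def partition(keys: List[int], n_parts: int) -> List[List[int]]:
    """Phase 1: range-partition the keys into nearly equal-sized sorted parts."""
    ordered = sorted(keys)  # simplification; not how a distributed job would split
    size, extra = divmod(len(ordered), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(ordered[start:end])
        start = end
    return parts


def build_local(sorted_keys: List[int]) -> Node:
    """Phase 2: bulk-load one local B+-tree from a sorted, non-empty partition."""
    leaves = [Node(keys=sorted_keys[i:i + ORDER])
              for i in range(0, len(sorted_keys), ORDER)]
    for left, right in zip(leaves, leaves[1:]):
        left.next_leaf = right
    return build_levels(leaves)


def merge(local_roots: List[Node]) -> Node:
    """Phase 3: join the leaf chains and put the local trees under shared levels."""
    for left, right in zip(local_roots, local_roots[1:]):
        edge_leaf(left, -1).next_leaf = edge_leaf(right, 0)
    target = max(height(root) for root in local_roots)
    padded = []
    for root in local_roots:
        while height(root) < target:  # pad so every leaf sits at the same depth
            root = Node(keys=[], children=[root])
        padded.append(root)
    return build_levels(padded)


def build_bplus_tree(keys: List[int], n_parts: int = 4) -> Node:
    if not keys:
        raise ValueError("at least one key is required")
    parts = [p for p in partition(keys, n_parts) if p]
    with ThreadPoolExecutor() as pool:  # stand-in for parallel Hadoop tasks
        local_roots = list(pool.map(build_local, parts))
    return merge(local_roots)


def search(root: Node, key: int) -> bool:
    node = root
    while node.children:
        node = node.children[bisect_right(node.keys, key)]
    return key in node.keys


def leaf_scan(root: Node) -> List[int]:
    """Return all keys in order by following the leaf sibling links."""
    node, out = edge_leaf(root, 0), []
    while node is not None:
        out.extend(node.keys)
        node = node.next_leaf
    return out
```

Calling `build_bplus_tree(keys, n_parts=4)` returns the merged root; `search(root, key)` then performs a point lookup, and `leaf_scan(root)` walks the linked leaves in key order. Because the partitions cover disjoint key ranges, the merge step in this sketch only adds upper levels and never needs to reorganize the local trees.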


Keywords: B-tree, B+-tree, Hadoop, Map-Reduce, Big Data, Cloud Computing



Part of this paper [14] was presented at the International Conference on Information Science and Applications (ICISA 2017), March 20–23, in Macau. I am grateful to the two anonymous commentators at the conference whose valuable suggestions helped improve the completeness of the paper.


  1. Comer, D.: The ubiquitous B-tree. ACM Comput. Surv. 11(2), 121–137 (1979)
  2. Cong, V.N.H., et al.: Improving the quality of an R-tree using the Map-Reduce framework. Advanced Multimedia and Ubiquitous Engineering (CUTE 2016), vol. 448, pp. 164–170. Springer, Singapore (2017)
  3. Cong, V.N.H.: Enhanced R-tree bulk loading scheme using Map-Reduce framework. M.S. Thesis of Department of IT Convergence and Application Engineering, pp. 4–22. The Graduate School, Pukyong National University, Republic of Korea (2017)
  4. Leutenegger, S.T., Edgington, J.M., Lopez, M.A.: STR: a simple and efficient algorithm for R-tree packing. In: IEEE 13th International Conference on Data Engineering, pp. 497–506 (1997)
  5. Kajioka, S., Mori, T., Uchiya, T., Takumi, I., Matsuo, H.: Experiment of indoor position presumption based on RSSI of Bluetooth LE beacon. In: 2014 IEEE 3rd Global Conference on Consumer Electronics (GCCE), pp. 337–339. IEEE (2014)
  6. Huh, J.-H., Je, S.-M., Seo, K.: Design and configuration of avoidance technique for worst situation in zigbee communications using OPNET. Information Science and Applications (ICISA). LNEE, vol. 376, pp. 331–336. Springer, Heidelberg (2016)
  7. Birkenmeier, G.F., Park, J.-K., Rizvi, S.T.: Principally quasi-Baer ring hulls. In: Van Huynh, D., López-Permouth, S.R. (eds.) Advances in Ring Theory. Trends in Mathematics, pp. 47–61. Springer, Basel (2010)
  8. Birkenmeier, G.F., Park, J.-K., Rizvi, S.T.: Ring hulls of semiprime homomorphic images. In: Brzeziński, T., Gómez Pardo, J.L., Shestakov, I., Smith, P.F. (eds.) Modules and Comodules. Trends in Mathematics, pp. 101–111. Springer, Basel (2008)
  9. Apache Hadoop:
  10. Prasad, S.K., McDermott, M., He, X.: GPGPU-based parallel R-tree construction and querying. In: 2015 IEEE International Conference (IPDPSW), pp. 619–627 (2015)
  11. Sung, Y., Jeong, Y.-S., Park, J.-H.: Beacon-based active media control interface in indoor ubiquitous computing environment. Cluster Comput. 19(1), 547–556 (2016)
  12. Huh, J.-H., Otgonchimeg, S., Seo, K.: Advanced metering infrastructure design and test bed experiment using intelligent agents: focusing on the PLC network base technology for Smart Grid system. J. Supercomput. 72(5), 1862–1877 (2016)
  13. Cheong, H., Eun, J., Kim, H., Kim, K.: Belief propagation decoding assisted on-the-fly Gaussian elimination for short LT codes. Cluster Comput. 19(1), 309–314 (2016)
  14. Huynh, C.V., Kim, J., Huh, J.H.: Improving the B+-tree construction for transaction log data in bank system using Hadoop. International Conference on Information Science and Applications (ICISA 2017). LNEE, vol. 424, pp. 519–525. Springer, Singapore (2017)
  15. Zhou, W., Lu, J., Luan, Z., Wang, S., Xue, G., Yao, S.: SNB-index: a SkipNet and B+ tree based auxiliary cloud index. Cluster Comput. 17(2), 453–462 (2014)
  16. Viglas, S.D.: Adapting the B+-tree for asymmetric I/O. In: East European Conference on Advances in Databases and Information Systems, pp. 399–412. Springer, Berlin, Heidelberg (2012)
  17. Abdullahi, A.U., Ahmad, R., Zakaria, M.N.: Experimental performance analysis of B+-trees with Big Data indexing potentials. In: International Conference of Reliable Information and Communication Technology, pp. 20–29. Springer, New York (2017)

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of IT, FPT University, Ho Chi Minh City, Socialist Republic of Vietnam, Hanoi, Vietnam
  2. Department of Software, Catholic University of Pusan, Busan, Republic of Korea
