An I/O-Efficient Buffer Batch Replacement Policy for Update-Intensive Graph Databases

Zhou, Ningnan; Zhou, Xuan; Zhang, Xiao; Wang, Shan; Liu, Ling

doi:10.1007/978-3-319-32049-6_15


Conference paper
First Online: 25 March 2016

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9643)

Abstract

With the proliferation of graph-based applications, such as social network management and Web structure mining, update-intensive graph databases have become an important component of today’s data management platforms. Several techniques have recently been proposed to exploit locality in both the data organization and the computational model of graph databases. However, little investigation has been conducted on buffer management in graph databases. To the best of our knowledge, current buffer managers of graph databases suffer performance loss caused by unnecessary random I/O accesses. To solve this problem, we develop a novel batch replacement policy for buffer management. This policy enables us to maximally exploit sequential I/O to improve the performance of graph databases. To enable the policy, we devise a segment-tree-based buffer manager that efficiently maintains the optimal replacement plan. Extensive experiments on real-world and synthetic datasets demonstrate the superiority of our method.
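
The abstract does not spell out the replacement policy or the segment tree itself, so the following is only a minimal sketch of the general idea behind batch replacement: evicting dirty pages that are adjacent on disk together, so that they can be written back in one sequential pass rather than as scattered random writes. The function names `contiguous_runs` and `pick_batch`, the integer page IDs, and the "longest run wins" rule are illustrative assumptions, not the authors' algorithm, and a plain sort stands in for the segment tree.

```python
from typing import Iterable, List, Tuple


def contiguous_runs(dirty_pages: Iterable[int]) -> List[Tuple[int, int]]:
    """Group page IDs into maximal runs of consecutive IDs as (start, length).

    Assumption: consecutive page IDs are adjacent on disk, so one run
    can be flushed with a single sequential write.
    """
    runs: List[Tuple[int, int]] = []
    for page in sorted(set(dirty_pages)):
        if runs and page == runs[-1][0] + runs[-1][1]:
            # The page directly follows the current run, so extend it.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # A gap in the IDs starts a new run.
            runs.append((page, 1))
    return runs


def pick_batch(dirty_pages: Iterable[int], batch_size: int) -> List[int]:
    """Pick the pages to evict together: the longest run, capped at batch_size.

    Hypothetical selection rule for illustration only; ties go to the
    run with the smallest starting page ID.
    """
    runs = contiguous_runs(dirty_pages)
    if not runs:
        return []
    start, length = max(runs, key=lambda run: run[1])
    return list(range(start, start + min(length, batch_size)))
```

For example, `pick_batch([7, 3, 4, 5, 10, 11], batch_size=2)` selects pages 3 and 4 from the run 3 to 5. Re-sorting on every eviction costs O(n log n); an index structure that is maintained incrementally, such as the segment tree named in the abstract, would avoid recomputing the runs from scratch.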

Notes

1. For convenience, the term “buffer manager” refers to the core component in the rest of the paper.
2. http://neo4j.com/.
3. https://github.com/graphchi/graphchiDB-scala.

Acknowledgement

This work is partially funded by the China Scholarship Council. Xuan Zhou’s research is supported by the National High-tech R&D Program (863 Program) (2015AA015307) and the NSFC Project (No. 61272138). Ling Liu’s research is partially supported by the National Science Foundation under Grants IIS-0905493, CNS-1115375, and IIP-1230740, and by a grant from Intel ISTC on Cloud Computing.

Author information

Authors and Affiliations

MOE Key Laboratory of DEKE, Renmin University of China, Beijing, China
Ningnan Zhou, Xuan Zhou, Xiao Zhang & Shan Wang
School of Information, Renmin University of China, Beijing, 100872, China
Ningnan Zhou, Xuan Zhou, Xiao Zhang & Shan Wang
College of Computing, Georgia Institute of Technology, Atlanta, USA
Ling Liu

Corresponding author

Correspondence to Xuan Zhou.

Editor information

Editors and Affiliations

Georgia Institute of Technology, Atlanta, Georgia, USA
Shamkant B. Navathe
University of Texas at Dallas, Richardson, Texas, USA
Weili Wu
University of Minnesota, Minneapolis, Minnesota, USA
Shashi Shekhar
Renmin University, Beijing, China
Xiaoyong Du
Fudan University, Shanghai, China
Sean X. Wang
Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
Hui Xiong

Cite this paper

Zhou, N., Zhou, X., Zhang, X., Wang, S., Liu, L. (2016). An I/O-Efficient Buffer Batch Replacement Policy for Update-Intensive Graph Databases. In: Navathe, S., Wu, W., Shekhar, S., Du, X., Wang, S., Xiong, H. (eds) Database Systems for Advanced Applications. DASFAA 2016. Lecture Notes in Computer Science, vol 9643. Springer, Cham. https://doi.org/10.1007/978-3-319-32049-6_15

DOI: https://doi.org/10.1007/978-3-319-32049-6_15
Published: 25 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32048-9
Online ISBN: 978-3-319-32049-6
eBook Packages: Computer Science, Computer Science (R0)
