Abstract
With the emergence of large-scale social networks such as Twitter, Facebook, LinkedIn and Google+, the growing trend of big data is becoming much clearer. In addition to massive storage requirements for this highly connected big data, efficient mechanisms for processing it are also needed. The inadequacy of traditional solutions such as relational database management systems for processing highly connected data is leading users towards graph databases. Graph databases are able to handle billions of nodes and relationships on a single machine, but the high growth rate of social data is already pushing their limits. In this work, we consider partitioning graph databases in order to increase the throughput of a graph database system. For this purpose, we design and implement a framework that both partitions a graph database and provides a fully functional distributed graph database system. We concentrate on access-pattern-based partitioning. In our experiments, access-pattern-based partitioning outperforms unbiased partitioning that depends only on the static structure of the graph. We evaluate our results on real-world datasets from the Erdos Web-Graph Project and the Pokec social network.
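To illustrate the distinction between structure-only and access-pattern-based partitioning, the following minimal Python sketch may help. It is not the framework described in the paper; the toy graph, the access log, and the helper names (`edge_key`, `access_weights`, `cut_cost`) are all assumptions made for illustration. The idea shown is that two partitions that cut the same number of edges can differ widely once each edge is weighted by how often it is traversed.

```python
from collections import Counter

def edge_key(u, v):
    """Normalize an undirected edge so (u, v) and (v, u) compare equal."""
    return (u, v) if u <= v else (v, u)

def access_weights(edges, access_log):
    """Weight each edge by how often it appears in a (hypothetical) traversal log."""
    counts = Counter(edge_key(u, v) for u, v in access_log)
    return {edge_key(u, v): counts[edge_key(u, v)] for u, v in edges}

def cut_cost(edges, part, weights=None):
    """Sum the weights of edges whose endpoints lie in different partitions.

    With weights=None every edge counts as 1, i.e. a purely structural cut.
    """
    total = 0
    for u, v in edges:
        if part[u] != part[v]:
            total += 1 if weights is None else weights[edge_key(u, v)]
    return total

# Toy graph (assumed): a 4-cycle a-b-c-d-a.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

# Assumed access log: edges a-b and c-d are traversed far more often.
access_log = [("a", "b")] * 5 + [("c", "d")] * 5 + [("b", "c"), ("d", "a")]
weights = access_weights(edges, access_log)

split_1 = {"a": 0, "b": 0, "c": 1, "d": 1}  # cuts b-c and d-a
split_2 = {"a": 0, "d": 0, "b": 1, "c": 1}  # cuts a-b and c-d

structural = (cut_cost(edges, split_1), cut_cost(edges, split_2))
access_based = (cut_cost(edges, split_1, weights), cut_cost(edges, split_2, weights))
print(structural, access_based)
```

Under the structural measure both splits look equally good, while the access-weighted measure prefers `split_1`, which keeps the frequently traversed edges inside a single partition. A real partitioner would minimize such a weighted cut subject to balance constraints rather than compare two hand-picked splits.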
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Tüfekçi, V., Özturan, C. (2015). Partitioning Graph Databases by Using Access Patterns. In: Pop, F., Potop-Butucaru, M. (eds) Adaptive Resource Management and Scheduling for Cloud Computing. ARMS-CC 2015. Lecture Notes in Computer Science, vol 9438. Springer, Cham. https://doi.org/10.1007/978-3-319-28448-4_12
DOI: https://doi.org/10.1007/978-3-319-28448-4_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28447-7
Online ISBN: 978-3-319-28448-4
eBook Packages: Computer Science (R0)