GraphDB – Storing Large Graphs on Secondary Memory

  • Lucas Fonseca Navarro
  • Ana Paula Appel
  • Estevam Rafael Hruschka Junior
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 241)


The volume of complex network data has grown exponentially in recent years, making graph mining the focus of many research efforts. Most algorithms for mining this kind of data assume, however, that the complex network fits in primary memory. Unfortunately, this assumption does not always hold. Even though, in some cases, large computer clusters (used in a MapReduce fashion, for instance) can circumvent part of the difficulty of mining big data, efficiently storing and retrieving complex network data remains a great challenge. The main goal of this work is therefore to introduce a new data structure, called the GraphDB-tree, that can be used to efficiently store and retrieve complex networks and that also supports efficient queries over large complex networks.


Keywords: Complex Network, Query Time, Link Prediction, Large Graph, Graph Database
These keywords were added by machine and not by the authors.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Lucas Fonseca Navarro (1)
  • Ana Paula Appel (2)
  • Estevam Rafael Hruschka Junior (1)
  1. Universidade Federal de São Carlos, São Carlos, Brazil
  2. IBM Research Brazil, São Paulo, Brazil
