Computing

pp 1–24

Hinode: implementing a vertex-centric modelling approach to maintaining historical graph data

  • Andreas Kosmatopoulos
  • Anastasios Gounaris
  • Kostas Tsichlas
Article

Abstract

Over the past few years, there has been a rapid increase in data originating from evolving networks such as social networks and sensor networks. A major challenge in handling such networks and their respective graphs is supporting historical queries on their data, that is, queries concerned with the state of the graph at previous time instances. While a number of works index historical data in a time-centric manner (i.e. according to the time instance at which an update event occurs), in this work we focus on the less-explored vertex-centric storage approach (i.e. according to the entity on which an update event occurs). We demonstrate that the design choices for a vertex-centric model are not trivial by proposing two different modelling and storage approaches that leverage NoSQL technology and by investigating their tradeoffs. More specifically, we experimentally evaluate the two models and show that, in certain cases, their relative performance can differ by a factor of several. Finally, we provide evidence that simple baseline and non-NoSQL solutions are slower by up to an order of magnitude.
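
To make the vertex-centric idea concrete, the following is a minimal in-memory sketch in Python. It is not one of the two models proposed in the article, and it does not use a NoSQL backend; the names `EdgeInterval`, `VertexCentricStore` and `neighbours_at` are hypothetical and chosen only for illustration. It assumes integer time instances and directed edges, and it groups update events under the vertex they affect rather than under the time instance at which they occur.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class EdgeInterval:
    """Lifetime of one edge as seen from its source vertex (hypothetical record)."""
    target: str
    start: int                 # time instance at which the edge was added
    end: Optional[int] = None  # time instance at which it was removed; None = still alive

    def alive_at(self, t: int) -> bool:
        # The edge exists from `start` (inclusive) up to `end` (exclusive).
        return self.start <= t and (self.end is None or t < self.end)


class VertexCentricStore:
    """Groups update events by the vertex they affect, not by when they occur."""

    def __init__(self) -> None:
        self._history: Dict[str, List[EdgeInterval]] = defaultdict(list)

    def add_edge(self, source: str, target: str, t: int) -> None:
        self._history[source].append(EdgeInterval(target, start=t))

    def remove_edge(self, source: str, target: str, t: int) -> None:
        # Close the open interval for this edge, if there is one.
        for interval in self._history[source]:
            if interval.target == target and interval.end is None:
                interval.end = t
                return

    def neighbours_at(self, source: str, t: int) -> Set[str]:
        """Historical query: out-neighbours of `source` at time instance `t`."""
        return {i.target for i in self._history.get(source, []) if i.alive_at(t)}


store = VertexCentricStore()
store.add_edge("a", "b", t=1)
store.add_edge("a", "c", t=2)
store.remove_edge("a", "b", t=4)

print(store.neighbours_at("a", 3))  # both edges exist at t=3
print(store.neighbours_at("a", 5))  # only a -> c remains at t=5
```

In this sketch a query about a single vertex touches only that vertex's history, whereas a time-centric index would locate the relevant time instance first and then reconstruct the graph state from there.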

Keywords

Historical queries; Historical graphs; Evolving graphs; Vertex-centric

Copyright information

© Springer-Verlag GmbH Austria, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Informatics, Aristotle University of Thessaloniki, Thessaloníki, Greece