Abstract
Graph theory and the study of networks can be traced back to Leonhard Euler’s original paper on the Seven Bridges of Königsberg, in 1736 [1]. Although the mathematical foundations for understanding graphs have been laid out over the last few centuries [2–4], it was not until recently, with the advent of modern computers, that parsing and analysis of large-scale graphs became tractable [5]. In the last decade, graph theory gained mainstream popularity following the adoption of graph models for new application domains, including social networks and the web of data, both of which generate extremely large and dynamic graphs that cannot be adequately handled by legacy graph management applications [6].
Notes
1. Extract, Transform and Load; a generic term in the data industry for data manipulation that occurs prior to the exercise of real interest.
2. Examples of ‘problem’ are minimal closure, shortest path, subgraph-matching, graph-isomorphism, etc.
3. Based upon a very early prototype of a small sub-set of the proposed KEL language.
4. In PIG you do not need to provide data types or even declare the schema for data, although the PIG manual warns “it may not work very well if you don’t”.
5. The man who created BCPL, which eventually led to C; and a great ‘Compiler Theory’ lecturer too!
6. In fact some of the elements and benefits of both appear within KEL. However, at all points we believe that the thought process of the encoder should be paramount rather than the academic purity of a particular abstraction.
7. The PARSE format allows entities and facts which have been extracted from text files to appear within the knowledge base.
8. Here the UID does not exist and so is generated based upon those fields.
9. Here the UID already exists and is called UID in the underlying data.
10. Once an entity with a UID has been declared, the type can be used to implicitly declare a foreign key existing within another part of the data.
11. Called allows for both bi-directional and unidirectional links to be used in text.
12. There is a separate category of graph function which returns scalar results; these are covered by the syntax discussed already.
13. At the moment it is envisaged that the outputs of an algorithm are only recorded in the algorithm declaration and implicitly appear as #1, #2 etc. in the production. This may prove too sloppy if algorithms with many, many outputs are invented.
References
1. http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg.
2. Euler L. Solutio Problematis ad Geometriam Situs Pertinentis. Novi Commentarii Academiae Scientiarum Imperialis Petropolitanae 7(1758–59), 9–28.
3. Hierholzer C. Über die Möglichkeit, einen Linienzug ohne Wiederholung und ohne Unterbrechung zu umfahren. Math Ann. 1873;6:30–2.
4. Biggs NL, et al. Graph theory 1736–1936. Oxford: Clarendon Press; 1986.
5. Agnarsson G. Graph theory: modeling, applications, and algorithms. Upper Saddle River: Prentice Hall; 2006.
6. Cudré-Mauroux P, et al. Graph data management systems for new application domains. In: Proceedings of the VLDB Endowment, vol 4, No 12; 2011.
7. Vicknair C, et al. A comparison of a graph database and a relational database. ACMSE ’10, April 15–17, Oxford, MS, USA; 2010.
8. Yang X, et al. Summary graphs for relational database schemas. In: Proceedings of the VLDB Endowment, vol 4, No 12; 2011.
9. Shao B, et al. Managing and mining large graphs: systems and implementations. SIGMOD ’12, May 20–24, Scottsdale, Arizona, USA; 2012.
10. Singla P, et al. Yes, there is a correlation - from social networks to personal behavior on the web. WWW 2008, April 21–25, Beijing, China; 2008.
11. Malm A, et al. Social network and distance correlates of criminal associates involved in illicit drug production. Secur J. 2008;21:77–94. doi:10.1057/palgrave.sj.8350069.
12. Latour J. Understanding consumer behavior through data analysis and simulation: are social networks changing the world economy? Master Thesis. http://essay.utwente.nl/58146/.
13. Averbuch A, et al. Partitioning graph databases - a quantitative evaluation. Master of Science Thesis, Stockholm, Sweden; 2010. arXiv:1301.5121.
14. Plantikow S, et al. Latency-optimal walks in replicated and partitioned graphs. In: DASFAA Workshops 2011, LNCS 6637, pp 14–27; 2011.
15. Middleton A. Data-intensive technologies for cloud computing. In: Handbook of cloud computing. Berlin: Springer; 2010.
16. http://hpccsystems.com/blog/adventures-graphland-v-graphland-gets-reality-check.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Bayliss, D., Villanustre, F. (2016). Graph Processing with Massive Datasets: A Kel Primer. In: Big Data Technologies and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-44550-2_11
DOI: https://doi.org/10.1007/978-3-319-44550-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44548-9
Online ISBN: 978-3-319-44550-2
eBook Packages: Computer Science (R0)