Graph Processing with Massive Datasets: A Kel Primer

Big Data Technologies and Applications

Abstract

Graph theory and the study of networks can be traced back to Leonhard Euler’s original paper on the Seven Bridges of Königsberg, in 1736 [1]. Although the mathematical foundations for understanding graphs have been laid out over the last few centuries [2–4], it wasn’t until recently, with the advent of modern computers, that parsing and analysis of large-scale graphs became tractable [5]. In the last decade, graph theory gained mainstream popularity following the adoption of graph models for new application domains, including social networks and the web of data, both generating extremely large and dynamic graphs that cannot be adequately handled by legacy graph management applications [6].

Notes

  1. Extract, Transform and Load; a generic term in the data industry for data manipulation that occurs prior to the exercise of real interest.

  2. Examples of ‘problem’ are minimal closure, shortest path, subgraph-matching, graph-isomorphism, etc.

  3. Based upon a very early prototype of a small sub-set of the proposed KEL language.

  4. In PIG you do not need to provide data types or even declare the schema for data, although the PIG manual warns “it may not work very well if you don’t”.

  5. The man who created BCPL, which eventually led to C; and a great ‘Compiler Theory’ lecturer too!

  6. In fact some of the elements and benefits of both appear within KEL. However, at all points we believe that the thought process of the encoder should be paramount rather than the academic purity of a particular abstraction.

  7. The PARSE format allows entities and facts which have been extracted from text files to appear within the knowledge base.

  8. Here the UID does not exist and so is generated based upon those fields.

  9. Here the UID already exists and is called UID in the underlying data.

  10. Once an entity with a UID has been declared then the type can be used to implicitly declare a foreign key existing within another part of the data.

  11. Called allows for both bi-directional and unidirectional links to be used in text.

  12. There is a separate category of graph function which returns scalar results; these are covered by the syntax discussed already.

  13. At the moment it is envisaged that the outputs of an algorithm are only recorded in the algorithm declaration and implicitly appear as #1, #2 etc. in the production. This may prove too sloppy if algorithms with many, many outputs are invented.

References

  1. http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg.

  2. Euler L. Solutio Problematis ad Geometriam Situs Pertinentis. Novi Commentarii Academiae Scientiarum Imperialis Petropolitanae 7 (1758–59), 9–28.

  3. Hierholzer C. Über die Möglichkeit, einen Linienzug ohne Wiederholung und ohne Unterbrechung zu umfahren. Math Ann. 1873;6:30–2.

  4. Biggs NL, et al. Graph theory 1736–1936. Oxford: Clarendon Press; 1986.

  5. Agnarsson G. Graph theory: modeling, applications, and algorithms. Upper Saddle River: Prentice Hall; 2006.

  6. Cudré-Mauroux P et al. Graph data management systems for new application domains. In: Proceedings of the VLDB Endowment, vol 4, No 12; 2011.

  7. Vicknair C et al. A comparison of a graph database and a relational database. ACMSE ’10, April 15–17, Oxford, MS, USA; 2010.

  8. Yang X et al. Summary graphs for relational database schemas. In: Proceedings of the VLDB Endowment, vol 4, No 12; 2011.

  9. Shao B et al. Managing and mining large graphs: systems and implementations. SIGMOD ’12, May 20–24, Scottsdale, Arizona, USA; 2012.

  10. http://en.wikipedia.org/wiki/Social_network_analysis.

  11. Singla P et al. Yes, there is a correlation—from social networks to personal behavior on the web. WWW 2008, April 21–25, Beijing, China; 2008.

  12. Malm A, et al. Social network and distance correlates of criminal associates involved in Illicit Drug Production. Secur J. 2008;21:77–94. doi:10.1057/palgrave.sj.8350069.

  13. Latour J. Understanding consumer behavior through data analysis and simulation: Are Social Networks changing the World economy? Master Thesis. http://essay.utwente.nl/58146/.

  14. Averbuch A et al. Partitioning graph databases—a quantitative evaluation. Master of Science Thesis Stockholm, Sweden; 2010. arXiv:1301.5121.

  15. Plantikow S et al. Latency-optimal walks in replicated and partitioned graphs. In: DASFAA Workshops 2011, LNCS 6637, pp 14–27; 2011.

  16. Middleton A. Data-intensive technologies for cloud computing. In: Handbook of cloud computing. Berlin: Springer; 2010.

  17. http://hpccsystems.com/blog/adventures-graphland-v-graphland-gets-reality-check.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Bayliss, D., Villanustre, F. (2016). Graph Processing with Massive Datasets: A Kel Primer. In: Big Data Technologies and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-44550-2_11

  • DOI: https://doi.org/10.1007/978-3-319-44550-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44548-9

  • Online ISBN: 978-3-319-44550-2

  • eBook Packages: Computer Science, Computer Science (R0)
