Abstract
In contrast to heavy-handed ER-style data models in relational databases, knowledge graphs (or graph databases) capture entity semantics in terms of entity relationships and properties, following a simple collect-as-you-go model. While this allows for a more flexible and dynamically adaptable knowledge representation, it comes at the price of more complex querying: with varying degrees of information sparsity, it becomes increasingly difficult to figure out what an entity actually represents. Thus, matching the intended schema as specified by a query against the entity patterns actually occurring in the graph database needs careful attention on a conceptual level. In this article, we analyze graph patterns as schema information from a graph pattern matching perspective. We argue that every query consists of a mixture of conceptual information (how entities are structured) and evaluation information (further dependencies and constraints on data), and that this mixture is not always easy to separate. To arrive at truly schema-aware graph query processing, we propose several matching mechanisms, each mandating a specific semantic meaning of the graph pattern, and discuss their practical applicability.
Notes
A preorder is a binary relation \(\mathbin{\sqsubseteq}\subseteq A\times A\) that is reflexive (i.e., for all \(a\in A\), \(a\sqsubseteq a\)) and transitive (i.e., for all \(a,b,c\in A\), \(a\sqsubseteq b\) and \(b\sqsubseteq c\) imply \(a\sqsubseteq c\)).
An equivalence relation is a preorder \(\mathbin{\equiv}\subseteq A\times A\) that is symmetric (i.e., for all \(a,b\in A\), \(a\equiv b\) implies \(b\equiv a\)).
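The two definitions above can be checked mechanically on a finite set. The following is a minimal sketch, assuming a relation is given as a Python set of pairs over a finite domain; the function names and the two example relations (`leq`, `same_parity`) are illustrative choices and do not come from the article.

```python
from itertools import product


def is_reflexive(domain, rel):
    # Every element must be related to itself.
    return all((a, a) in rel for a in domain)


def is_transitive(rel):
    # Whenever (a, b) and (b, d) are in rel, (a, d) must be in rel as well.
    return all(
        (a, d) in rel
        for (a, b), (c, d) in product(rel, rel)
        if b == c
    )


def is_symmetric(rel):
    # Whenever (a, b) is in rel, (b, a) must be in rel as well.
    return all((b, a) in rel for (a, b) in rel)


def is_preorder(domain, rel):
    return is_reflexive(domain, rel) and is_transitive(rel)


def is_equivalence(domain, rel):
    # An equivalence relation is a symmetric preorder.
    return is_preorder(domain, rel) and is_symmetric(rel)


# Hypothetical example relations over a small domain.
A = {1, 2, 3}
leq = {(a, b) for a in A for b in A if a <= b}
same_parity = {(a, b) for a in A for b in A if a % 2 == b % 2}
```

Under these assumptions, `leq` is a preorder but not an equivalence relation, because it contains (1, 2) but not (2, 1), whereas `same_parity` satisfies all three properties.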
Cite this article
Mennicke, S., Kalo, JC. & Balke, WT. Using Queries as Schema-Templates for Graph Databases. Datenbank Spektrum 18, 89–98 (2018). https://doi.org/10.1007/s13222-018-0286-9
DOI: https://doi.org/10.1007/s13222-018-0286-9