Abstract
Extending graph models to incorporate uncertainty is important for many applications, including citation networks, disease transmission networks, social networks, and observational networks. These networks may have existence probabilities associated with nodes or edges, as well as probabilities associated with attribute values of nodes or edges. Comparing graphs and subgraphs is challenging even without probabilities; once the uncertainty of different graph elements and attributes is considered, traditional graph operators and semantics are insufficient. In this paper, we present a prototype SQL-like graph query language that focuses on operators for querying and comparing uncertain graphs and subgraphs. Two operators of particular interest are ego neighborhood similarity and semantic path similarity. Similarity operators are particularly useful for comparison queries, the focus of this paper. After motivating and describing our operators, we present an implementation of a query engine that uses this query language. This implementation combines a layered architecture with a service-oriented one and is designed to be extensible, so that simple operators can be used as building blocks for more complex ones. We demonstrate the utility of our query language and operators for analyzing uncertain graphs based on two real-world networks, a dolphin observation network and a citation network. Finally, we conduct a performance evaluation of some of the more complex operators, illustrating the viability of these operators for analysis of larger graphs.
Notes
1. The numbers do not add up to the total count of sponging and snacking dolphins, because the sex for some of these dolphins cannot be established with certainty.
Acknowledgments
This work was supported in part by National Science Foundation grants 0941487 and 0937070 and by Office of Naval Research grant 10230702.
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Dimitrov, D., Singh, L., Mann, J. (2015). Query Operators for Comparing Uncertain Graphs. In: Hameurlain, A., Küng, J., Wagner, R., Decker, H., Lhotska, L., Link, S. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XVIII. Lecture Notes in Computer Science, vol 8980. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46485-4_5
DOI: https://doi.org/10.1007/978-3-662-46485-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46484-7
Online ISBN: 978-3-662-46485-4
eBook Packages: Computer Science (R0)