Query Operators for Comparing Uncertain Graphs

  • Denis Dimitrov
  • Lisa Singh
  • Janet Mann
Chapter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8980)

Abstract

Extending graph models to incorporate uncertainty is important for many applications, including citation networks, disease transmission networks, social networks, and observational networks. These networks may have existence probabilities associated with nodes or edges, as well as probabilities associated with attribute values of nodes or edges. Comparing graphs and subgraphs is challenging even when no probabilities are involved. When the uncertainty of different graph elements and attributes is also considered, traditional graph operators and semantics are insufficient. In this paper, we present a prototype SQL-like graph query language that focuses on operators for querying and comparing uncertain graphs and subgraphs. Two of the more interesting operators are ego neighborhood similarity and semantic path similarity. Similarity operators are particularly useful for comparison queries, the focus of this paper. After motivating and describing our operators, we present an implementation of a query engine that uses this query language. This implementation combines a layered and service-oriented architecture and is designed to be extensible, so that simple operators can be used as building blocks for more complex ones. We demonstrate the utility of our query language and operators for analyzing uncertain graphs based on two real-world networks: a dolphin observation network and a citation network. Finally, we conduct a performance evaluation of some of the more complex operators, illustrating the viability of these operators for analysis of larger graphs.

Keywords

Graph query language, Comparison queries, Similarity queries, Uncertain graphs

Notes

Acknowledgments

This work was supported in part by the National Science Foundation Grant Nbrs. 0941487 and 0937070, and the Office of Naval Research Grant Nbr. 10230702.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Georgetown University, Washington, DC, USA
