Abstract
Documents semantic representations built from open Knowledge Graphs (KGs) have proven to be beneficial in tasks such as recommendation, user profiling, and document retrieval. Broadly speaking, a semantic representation of a document can be defined as a graph whose nodes represent concepts and whose edges represent the semantic relationships between them. Fine-grained information about the concepts found in the KGs (e.g. DBpedia, YAGO, BabelNet) can be exploited to enrich and refine the representation. Although this kind of semantic representation is a graph, most applications that compare semantic representations reduce this graph to a “flattened” concept-weight representation and use existing well-known vector similarity measures. Consequently, relevant information related to the graph structure is not exploited. In this paper, different graph-based similarity measures are adapted to semantic representation graphs and are implemented and evaluated. Experiments performed on two datasets reveal better results when using the graph similarity measures than when using vector similarity measures. This paper presents the conceptual background, the adapted measures and their evaluation and ends with some conclusions on the threshold between precision and computational complexity.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
- 2.
- 3.
- 4.
Hereafter, we use concept and entity interchangeably to refer to resources of the KG.
- 5.
- 6.
References
Bunke, H.: Recent developments in graph matching. In: Proceedings 15th International Conference on Pattern Recognition, ICPR-2000, vol. 2, pp. 117–124 (2000)
Bunke, H., Shearer, K.: A graph distance metric based on the maximal common subgraph. Pattern Recognit. Lett. 19(3), 255–259 (1998)
Corcoglioniti, F., Dragoni, M., Rospocher, M., Aprosio, A.P.: Knowledge extraction for information retrieval. In: Sack, H., Blomqvist, E., d’Aquin, M., Ghidini, C., Ponzetto, S.P., Lange, C. (eds.) ESWC 2016. LNCS, vol. 9678, pp. 317–333. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34129-3_20
Fankhauser, S., Riesen, K., Bunke, H.: Speeding up graph edit distance computation through fast bipartite matching. In: Jiang, X., Ferrer, M., Torsello, A. (eds.) GbRPR 2011. LNCS, vol. 6658, pp. 102–111. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20844-7_11
Färber, M., Ell, B., Menne, C., Rettinger, A.: A comparative survey of DBpedia, freebase, OpenCyc, Wikidata, and YAGO. Semant. Web J. 1–26 (2015)
Gao, X., Xiao, B., Tao, D., Li, X.: A survey of graph edit distance. Pattern Anal. Appl. 13(1), 113–129 (2010)
Hassan, S., Mihalcea, R.: Semantic relatedness using salient semantic analysis. In: AAAI (2011)
Jouili, S., Tabbone, S., Valveny, E.: Comparing graph similarity measures for graphical recognition. In: Ogier, J.-M., Liu, W., Lladós, J. (eds.) GREC 2009. LNCS, vol. 6020, pp. 37–48. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13728-0_4
Lee, M.D., Welsh, M.: An empirical evaluation of models of text document similarity. In: CogSci 2005, pp. 1254–1259. Erlbaum (2005)
Manrique, R., Herazo, O., Mariño, O.: Exploring the use of linked open data for user research interest modeling. In: Solano, A., Ordoñez, H. (eds.) CCC 2017. CCIS, vol. 735, pp. 3–16. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66562-7_1
Manrique, R., Mariño, O.: How does the size of a document affect linked open data user modeling strategies? In: Proceedings of the International Conference on Web Intelligence, WI 2017, pp. 1246–1252. ACM, New York (2017)
Manrique, R., Mariño, O.: Diversified semantic query reformulation. In: Różewski, P., Lange, C. (eds.) KESW 2017. CCIS, vol. 786, pp. 23–37. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69548-8_3
Mendes, P.N., Jakob, M., García-Silva, A., Bizer, C.: Dbpedia spotlight: shedding light on the web of documents. In: Proceedings of the 7th International Conference on Semantic Systems, I-Semantics 2011, pp. 1–8. ACM, New York (2011)
Musto, C., Lops, P., de Gemmis, M., Semeraro, G.: Semantics-aware recommender systems exploiting linked open data and graph-based features. Knowl.-Based Syst. 136, 1–14 (2017)
Nunes, B.P., Fetahu, B., Kawase, R., Dietze, S., Casanova, M.A., Maynard, D.: Interlinking documents based on semantic graphs with an application. In: Tweedale, J.W., Jain, L.C., Watada, J., Howlett, R.J. (eds.) Knowledge-Based Information Systems in Practice. SIST, vol. 30, pp. 139–155. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-13545-8_9
Papadimitriou, P., Dasdan, A., Garcia-Molina, H.: Web graph similarity for anomaly detection. J. Internet Serv. Appl. 1(1), 19–30 (2010)
Piao, G., Breslin, J.G.: Analyzing aggregated semantics-enabled user modeling on google+ and twitter for personalized link recommendations. In: Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization, UMAP 2016, pp. 105–109. ACM, New York (2016)
Schuhmacher, M., Ponzetto, S.P.: Knowledge-based graph document modeling. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM 2014, pp. 543–552. ACM, New York (2014)
Sugiyama, K., Kan, M.Y.: A comprehensive evaluation of scholarly paper recommendation using potential citation papers. Int. J. Digit. Libr. 16(2), 91–109 (2015)
Waitelonis, J., Exeler, C., Sack, H.: Enabled generalized vector space model to improve document retrieval. In: Proceedings of the Third NLP & DBpedia Workshop (NLP & DBpedia 2015) Co-located with the 14th International Semantic Web Conference 2015 (ISWC 2015), 11 October 2015, Bethlehem, Pennsylvania, USA, pp. 33–44 (2015)
Willett, P.: Matching of chemical and biological structures using subgraph and maximal common subgraph isomorphism algorithms. In: Truhlar, D.G., Howe, W.J., Hopfinger, A.J., Blaney, J., Dammkoehler, R.A. (eds.) Rational Drug Design, vol. 108, pp. 11–38. Springer, New York (1999). https://doi.org/10.1007/978-1-4612-1480-9_3
Acknowledgment
This work was partially supported by COLCIENCIAS PhD scholarship (Call 647-2014).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Manrique, R., Cueto-Ramirez, F., Mariño, O. (2018). Comparing Graph Similarity Measures for Semantic Representations of Documents. In: Serrano C., J., Martínez-Santos, J. (eds) Advances in Computing. CCC 2018. Communications in Computer and Information Science, vol 885. Springer, Cham. https://doi.org/10.1007/978-3-319-98998-3_13
Download citation
DOI: https://doi.org/10.1007/978-3-319-98998-3_13
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98997-6
Online ISBN: 978-3-319-98998-3
eBook Packages: Computer ScienceComputer Science (R0)