An Evolutionary Analysis of DBpedia Datasets
Linked Data, a method for publishing interrelated data on the Semantic Web, has developed rapidly in recent years thanks to new techniques that enhance the availability of knowledge. As one of the most important central hubs of Linked Data, DBpedia is a large crowd-sourced encyclopedia that contains diverse, multilingual knowledge from various domains, represented in RDF. Existing research has mostly focused on the basic characteristics of a single version of the DBpedia datasets. We are not aware of any evolutionary analysis that comprehensively examines the changes across DBpedia versions. In this paper, we first present an overall evolutionary analysis from a graph perspective, clarifying the evolution of DBpedia by comparing 6 versions of the datasets. We then select two specific domains as subgraphs and calculate a series of metrics to illustrate the changes. Additionally, we carry out an evolutionary analysis of the interlinks between DBpedia and other Linked Data resources. Our analysis shows that, although the knowledge in DBpedia has grown overall in the past five years, there are quite a few counter-intuitive results.
Keywords: DBpedia, Evolutionary analysis, RDF graph
This work is supported by the National Natural Science Foundation of China (61572353), the Natural Science Foundation of Tianjin (17JCYBJC15400), and the National Training Programs of Innovation and Entrepreneurship for Undergraduates (201710056091).
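The version-to-version comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `graph_metrics` and `compare_versions`, the choice of metrics, and the toy triples are all assumptions made for illustration. Each RDF triple is treated as one directed, labeled edge from subject to object.

```python
from typing import Dict, Iterable, Set, Tuple

# A triple is (subject, predicate, object); this alias is illustrative only.
Triple = Tuple[str, str, str]


def graph_metrics(triples: Iterable[Triple]) -> Dict[str, float]:
    """Basic size metrics for one snapshot, viewing each triple as a directed edge."""
    unique = set(triples)
    nodes: Set[str] = set()
    for s, _, o in unique:
        nodes.add(s)
        nodes.add(o)
    n, m = len(nodes), len(unique)
    return {
        "nodes": n,
        "edges": m,  # distinct triples; parallel edges with different predicates count separately
        "avg_out_degree": m / n if n else 0.0,
    }


def compare_versions(old: Iterable[Triple], new: Iterable[Triple]) -> Dict[str, int]:
    """Count triples added, removed, and kept between two snapshots."""
    old_set, new_set = set(old), set(new)
    return {
        "added": len(new_set - old_set),
        "removed": len(old_set - new_set),
        "kept": len(old_set & new_set),
    }


# Hypothetical toy snapshots, not taken from any real DBpedia release.
v1 = [
    ("dbr:Tianjin", "dbo:country", "dbr:China"),
    ("dbr:Beijing", "dbo:country", "dbr:China"),
]
v2 = [
    ("dbr:Tianjin", "dbo:country", "dbr:China"),
    ("dbr:Tianjin", "dbo:isPartOf", "dbr:North_China"),
]

m1, m2 = graph_metrics(v1), graph_metrics(v2)
delta = compare_versions(v1, v2)
print(m1, m2, delta)
```

In this toy case the node and edge counts are identical in both snapshots even though one triple was removed and another added, which is why a set-level diff is a useful complement to aggregate counts when looking for non-monotonic changes between versions.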