An Analysis of Data Quality in DBpedia and Zhishi.me
Linked Data has grown at an accelerating pace since its launch in 2006. As more RDF data becomes available on the web, errors proliferate as well, and the quality of Linked Data has drawn increasing public attention. Because data quality affects the reliability and efficiency of web applications that consume Linked Data, the need for quality analysis of Linked Data is becoming increasingly pressing. In this paper, we present several problems concerning the quality of Linked Data. We discovered these problems by analyzing two cross-domain RDF datasets, DBpedia and Zhishi.me, both of which are built by automatically extracting resources from existing encyclopedias. Some of the problems can be detected with simple SPARQL queries, while others cannot. For every problem listed in this paper, we present a detection method, and we conduct experiments to demonstrate the validity of our methods.
Keywords: Object Property, SPARQL Query, Meaningful Unit, Levenshtein Distance, Parent Category