Abstract
Linked Data has grown at an accelerating pace since its launch in 2006. As more RDF data becomes available on the web, errors proliferate as well, and the quality of Linked Data has drawn increasing attention. Because data quality affects the reliability and efficiency of web applications that consume Linked Data, the demand for quality analysis of Linked Data is growing. In this paper, we present several problems concerning the quality of Linked Data. These problems were discovered through our analysis of two cross-domain RDF datasets, DBpedia and Zhishi.me, both of which are built by automatically extracting resources from existing encyclopedias. Some of the problems can be detected with simple SPARQL queries, while others cannot. For every problem listed in this paper, we present a detection method, and we report experiments that demonstrate the validity of our methods.
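To make the idea of a query-detectable problem concrete, the following is a minimal sketch and not the authors' method. It assumes one hypothetical kind of problem (a subject carrying several conflicting values for a property expected to be single-valued), uses made-up `ex:` triples rather than data from DBpedia or Zhishi.me, and shows a roughly equivalent SPARQL 1.1 query alongside a plain Python check.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object).
# These are illustrative only and are not taken from DBpedia or Zhishi.me.
TRIPLES = [
    ("ex:Alice", "ex:birthDate", "1980-01-01"),
    ("ex:Alice", "ex:birthDate", "1981-05-05"),
    ("ex:Bob", "ex:birthDate", "1975-03-03"),
    ("ex:Bob", "ex:name", "Bob"),
]

# A roughly equivalent SPARQL 1.1 query, kept here as a string for reference.
# The ex: namespace and the choice of property are assumptions.
EQUIVALENT_SPARQL = """
PREFIX ex: <http://example.org/>
SELECT ?s (COUNT(DISTINCT ?o) AS ?n)
WHERE { ?s ex:birthDate ?o }
GROUP BY ?s
HAVING (COUNT(DISTINCT ?o) > 1)
"""


def subjects_with_conflicting_values(triples, predicate):
    """Return subjects that have more than one distinct object for `predicate`."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            values[s].add(o)
    # Keep only subjects with two or more distinct values, sorted for stable output.
    return {s: sorted(objs) for s, objs in values.items() if len(objs) > 1}


if __name__ == "__main__":
    print(subjects_with_conflicting_values(TRIPLES, "ex:birthDate"))
```

Checks of this shape need only grouping and counting, which is why a single query suffices; problems that require background knowledge or reasoning over the ontology would not fit this pattern.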
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ma, Y., Qi, G. (2013). An Analysis of Data Quality in DBpedia and Zhishi.me. In: Qi, G., Tang, J., Du, J., Pan, J.Z., Yu, Y. (eds) Linked Data and Knowledge Graph. CSWS 2013. Communications in Computer and Information Science, vol 406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54025-7_10
DOI: https://doi.org/10.1007/978-3-642-54025-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54024-0
Online ISBN: 978-3-642-54025-7
eBook Packages: Computer Science, Computer Science (R0)