Assessing the Completeness Evolution of DBpedia: A Case Study

  • Subhi Issa
  • Pierre-Henri Paris
  • Fayçal Hamdi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10651)


RDF web datasets, thanks to their semantic richness, variety, and fine granularity, are increasingly adopted by both research and business communities. However, because anyone can publish data, descriptions are often sparse and heterogeneous, which undeniably affects quality. Consequently, increasing effort is being dedicated to improving Web data quality. We are interested in data quality and, more precisely, in how completeness evolves over time. This paper presents a set of experiments that analyze the evolution of completeness values over several versions of DBpedia.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. CEDRIC - Conservatoire National des Arts et Métiers, Paris, France
