
Completeness and Reliability of Wikipedia Infoboxes in Various Languages

  • Conference paper

Business Information Systems Workshops (BIS 2017)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 303)

Abstract

Despite its popularity, Wikipedia is often criticized for poor information quality. This online knowledge base currently consists of over 45 million articles in almost 300 languages. Wikipedia articles often include a special table that concisely presents important information about persons, places, products, organizations, and other subjects. Such a table is usually placed in a prominent part of the article and is called an “infobox” by the Wikipedia community. Infoboxes contain information in a structured form, which makes it possible to automatically enrich popular public databases such as DBpedia. Wikipedia users can edit infoboxes in different language versions independently, so the quality of information about the same subject may differ between languages. This article examines the completeness and reliability of infoboxes on different topics in seven language versions of Wikipedia: English, German, French, Polish, Russian, Ukrainian, and Belarusian. The results of the study can be used for automatically assessing and improving the quality of information in Wikipedia as well as in other public knowledge bases.


Notes

  1. http://www.alexa.com/siteinfo/wikipedia.org
  2. https://meta.wikimedia.org/wiki/List_of_Wikipedias
  3. https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Tabular_Data
  4. https://www.wikidata.org
  5. http://wiki.dbpedia.org/
  6. https://github.com/dbpedia/extraction-framework
  7. http://mappings.dbpedia.org
  8. http://wikirank.net
  9. https://pl.wikipedia.org/wiki/StarCraft_II:_Wings_of_Liberty


Author information

Correspondence to Włodzimierz Lewoniewski.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Lewoniewski, W. (2017). Completeness and Reliability of Wikipedia Infoboxes in Various Languages. In: Abramowicz, W. (eds) Business Information Systems Workshops. BIS 2017. Lecture Notes in Business Information Processing, vol 303. Springer, Cham. https://doi.org/10.1007/978-3-319-69023-0_25
