Discovering Wikipedia Conventions Using DBpedia Properties

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9507)

Abstract

Wikipedia is a public, universal encyclopedia whose contributors edit articles collaboratively. Semantic technologies have used Wikipedia infoboxes and categories to create DBpedia, a knowledge base that semantically describes Wikipedia content and makes it publicly available on the Web. The semantic descriptions in DBpedia can be exploited not only for data retrieval but also for identifying missing navigational paths in Wikipedia. Existing approaches have demonstrated that missing navigational paths are useful to the Wikipedia community, but injecting them has to respect Wikipedia conventions. In this paper, we present BlueFinder, a collaborative recommender system approach that enhances Wikipedia content with DBpedia properties. BlueFinder implements a supervised learning algorithm to predict the Wikipedia conventions used to represent similar connected pairs of articles; these predictions are used to recommend the best convention(s) for connecting disconnected articles. We report on an exhaustive evaluation that shows three notable results: (1) evidence of a relevant information gap between DBpedia and Wikipedia; (2) the behavior and accuracy of the BlueFinder algorithm; and (3) differences in Wikipedia conventions according to the specificity of the articles involved. BlueFinder helps Wikipedia contributors add missing relations between articles and, consequently, improves Wikipedia content.
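The idea of predicting a convention from similar connected pairs can be illustrated with a minimal sketch. It is not the BlueFinder implementation: it assumes a plain k-nearest-neighbors vote, Jaccard similarity over sets of type labels, and a `#from` / `#to` placeholder notation for generalized paths. The function names, the `PAIRS` toy data, and the `CityA`, `PersonA`, `CountryA`, and `PersonB` titles are hypothetical and chosen only for illustration; the Edinburgh path is the one described in the Notes.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def generalize(path, source, target):
    """Replace concrete article titles with placeholders so that paths
    observed for different pairs become comparable conventions."""
    steps = [s.replace(source, "#from").replace(target, "#to") for s in path]
    return " / ".join(steps)


def recommend(query_types, connected_pairs, k=3, top=1):
    """Let the k connected pairs most similar to the query vote for
    the generalized paths (conventions) that connect them."""
    ranked = sorted(
        connected_pairs,
        key=lambda p: jaccard(query_types, p["types"]),
        reverse=True,
    )
    votes = Counter()
    for pair in ranked[:k]:
        votes.update(
            generalize(path, pair["source"], pair["target"])
            for path in pair["paths"]
        )
    return [convention for convention, _ in votes.most_common(top)]


# Toy data (assumption): only the Edinburgh path comes from the Notes;
# the other pairs and all type labels are invented for illustration.
PAIRS = [
    {
        "source": "Edinburgh",
        "target": "Charlie Aitken",
        "types": {"City", "Person", "Footballer"},
        "paths": [["Edinburgh", "Category:Edinburgh",
                   "Category:People from Edinburgh", "Charlie Aitken"]],
    },
    {
        "source": "CityA",
        "target": "PersonA",
        "types": {"City", "Person", "Musician"},
        "paths": [["CityA", "Category:CityA",
                   "Category:People from CityA", "PersonA"]],
    },
    {
        "source": "CountryA",
        "target": "PersonB",
        "types": {"Country", "Person", "Politician"},
        "paths": [["CountryA", "PersonB"]],
    },
]

if __name__ == "__main__":
    # A disconnected (city, musician) pair asks which convention to use.
    print(recommend({"City", "Person", "Musician"}, PAIRS, k=2))
```

In this sketch, the two nearest neighbors of the query pair both use the category-based path, so that generalized path is the one recommended; the direct-link pair is too dissimilar to vote when `k=2`.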

Notes

  1. http://www.wikipedia.org.

  2. A place could be a Country, Province, City, or State.

  3. DBpedia of July 2013.

  4. Charlie Aitken (footballer born 1942).

  5. This should be read as follows: from the Edinburgh article, the user follows a link to the category Edinburgh, then navigates to the category People from Edinburgh, and from there to the Charlie Aitken article.

  6. http://en.wikipedia.org/wiki/Wikipedia:Conventions.

  7. http://en.wikipedia.org/wiki/Boston.

  8. http://en.wikipedia.org/wiki/Tim_Barsky.

  9. http://en.wikipedia.org/wiki/Donna_Summer.

  10. In this work, we use \(\beta = 1000\).

Acknowledgements

This work is supported by the French National Research Agency (ANR) through the KolFlow project (code: ANR-10-CONTINT-025), part of the CONTINT research program.

Author information

Corresponding author

Correspondence to Diego Torres.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Torres, D., Skaf-Molli, H., Molli, P., Díaz, A. (2016). Discovering Wikipedia Conventions Using DBpedia Properties. In: Molli, P., Breslin, J., Vidal, ME. (eds) Semantic Web Collaborative Spaces. SWCS 2013, SWCS 2014. Lecture Notes in Computer Science, vol 9507. Springer, Cham. https://doi.org/10.1007/978-3-319-32667-2_6

  • DOI: https://doi.org/10.1007/978-3-319-32667-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32666-5

  • Online ISBN: 978-3-319-32667-2

  • eBook Packages: Computer Science, Computer Science (R0)
