Wikipedia as a Source of Ontological Knowledge: State of the Art and Application

  • Chapter
Intelligent Networking, Collaborative Systems and Applications

Part of the book series: Studies in Computational Intelligence ((SCI,volume 329))

Abstract

This chapter argues that Wikipedia can serve as a source of knowledge for building semantically enabled applications. It consists of two parts. First, we provide an overview of the research fields that attempt to extract the knowledge encoded by humans in Wikipedia. The extracted knowledge can then be used to create a new generation of intelligent applications that build on the collaborative character of Wikipedia rather than on domain ontologies, which require the intervention of knowledge engineers and domain experts. Second, as a proof of concept, we describe an application whose intelligent behavior is achieved by using Wikipedia knowledge for the automatic annotation and representation of multimedia presentations.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Fogarolli, A. (2010). Wikipedia as a Source of Ontological Knowledge: State of the Art and Application. In: Caballé, S., Xhafa, F., Abraham, A. (eds) Intelligent Networking, Collaborative Systems and Applications. Studies in Computational Intelligence, vol 329. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16793-5_1

  • DOI: https://doi.org/10.1007/978-3-642-16793-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16792-8

  • Online ISBN: 978-3-642-16793-5

  • eBook Packages: Engineering, Engineering (R0)
