Coupling semantic and statistical techniques for dynamically enriching web ontologies

Journal of Intelligent Information Systems

Abstract

With the development of Semantic Web technology, the use of ontologies to store and retrieve information across several domains has increased. However, very few ontologies can keep pace with the ever-growing need for frequently updated semantic information or with specific user requirements in specialized domains. As a result, a critical issue is the unavailability of relational information between concepts, also termed missing background knowledge. One solution relies on the manual enrichment of ontologies by domain experts, which is a time-consuming and costly process; hence the need for dynamic ontology enrichment. In this paper we present an automatic, coupled statistical/semantic framework for dynamically enriching large-scale generic ontologies from the World Wide Web. Using the massive amount of information encoded in texts on the Web as a corpus, missing background knowledge can be discovered through a combination of semantic relatedness measures and pattern acquisition techniques, and subsequently exploited. The benefits of our approach are that it (i) proposes the dynamic enrichment of large-scale generic ontologies with missing background knowledge, thus enabling the reuse of such knowledge, and (ii) addresses the cost of manual ontology enrichment by domain experts. Experimental results in a precision-based evaluation setting demonstrate the effectiveness of the proposed techniques.
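
The abstract does not say which relatedness measure or which patterns the framework uses, so the following Python sketch is only an illustration of how the two ingredients could be combined, not the authors' implementation. It assumes the Normalized Google Distance as the statistical measure and a single "such as" lexico-syntactic pattern for pattern acquisition. The function names, the hit counts, and the index size n are hypothetical.

```python
import math
import re


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google Distance from page counts.

    fx, fy -- number of pages containing each term (assumed inputs)
    fxy    -- number of pages containing both terms
    n      -- assumed total number of indexed pages
    Smaller values mean the terms are more closely related.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # no co-occurrence evidence: treat as unrelated
    if n <= min(fx, fy):
        raise ValueError("n must exceed the individual page counts")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# One illustrative pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
SUCH_AS = re.compile(r"(\w+) such as ((?:\w+(?:, | and | or )?)+)")


def extract_hyponym_pairs(sentence):
    """Return (hyponym, hypernym) candidates matched by the pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        for hyponym in re.split(r", | and | or ", match.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs


def candidate_relations(sentence, counts, n, threshold):
    """Keep pattern matches whose terms are also statistically related.

    counts maps a term or a (hyponym, hypernym) pair to a page count;
    in a real system these would come from a search engine, here they
    are supplied by the caller.
    """
    kept = []
    for hyponym, hypernym in extract_hyponym_pairs(sentence):
        distance = normalized_google_distance(
            counts.get(hyponym, 0),
            counts.get(hypernym, 0),
            counts.get((hyponym, hypernym), 0),
            n,
        )
        if distance <= threshold:
            kept.append((hyponym, hypernym))
    return kept
```

For example, `extract_hyponym_pairs("vehicles such as cars, trucks and buses")` yields three candidate pairs, each with `vehicles` as the hypernym. Filtering pattern matches by a distance threshold is one simple way to couple the two signals; the threshold value would have to be tuned empirically.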

Author information

Correspondence to Mohammed Belkhatir.

About this article

Cite this article

Maree, M., Belkhatir, M. Coupling semantic and statistical techniques for dynamically enriching web ontologies. J Intell Inf Syst 40, 455–478 (2013). https://doi.org/10.1007/s10844-012-0233-4
