Abstract
We propose OntoGain, a system for unsupervised ontology acquisition from unstructured text that relies on multi-word term extraction. To acquire taxonomic relations, we exploit the lexical information inherent in multi-word terms and compare two methods: agglomerative hierarchical clustering and formal concept analysis. To detect non-taxonomic relations, we compare an association-rule-based algorithm with a probabilistic algorithm. OntoGain can transform the derived ontology into standard OWL statements. We compare OntoGain's results with hand-crafted ontologies and with a state-of-the-art system in two domains: medicine and computer science.
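As a rough illustration of how agglomerative hierarchical clustering can use the lexical content of multi-word terms, the sketch below repeatedly merges the two most similar clusters of terms. The abstract does not state the similarity measure, linkage criterion, or stopping rule, so the Jaccard token overlap, the average linkage, the `threshold` value, and all function names here are assumptions made for illustration, not the OntoGain implementation.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two multi-word terms (assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster_similarity(c1, c2) -> float:
    """Average-linkage similarity between two clusters of terms (assumed linkage)."""
    return sum(jaccard(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def agglomerate(terms, threshold=0.2):
    """Merge the two most similar clusters until no pair reaches the threshold.

    Returns the final clusters and the merge history, which records the
    hierarchy bottom-up. The threshold is a hypothetical stopping rule.
    """
    clusters = [(t,) for t in terms]  # start with one cluster per term
    history = []
    while len(clusters) > 1:
        # Find the most similar pair of clusters (first pair wins on ties).
        (i, j), best = max(
            (((i, j), cluster_similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        history.append((clusters[i], clusters[j], best))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, history


# Toy input: hypothetical terms, not taken from the paper's corpora.
terms = [
    "neural network",
    "recurrent neural network",
    "operating system",
    "distributed operating system",
    "heart disease",
]
clusters, history = agglomerate(terms)
for cluster in clusters:
    print(cluster)
```

On this toy list, the two pairs of terms that share a head phrase are merged, and "heart disease" remains a singleton because it shares no tokens with the other terms.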
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Drymonas, E., Zervanou, K., Petrakis, E.G.M. (2010). Unsupervised Ontology Acquisition from Plain Texts: The OntoGain System. In: Hopfe, C.J., Rezgui, Y., Métais, E., Preece, A., Li, H. (eds) Natural Language Processing and Information Systems. NLDB 2010. Lecture Notes in Computer Science, vol 6177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13881-2_29
DOI: https://doi.org/10.1007/978-3-642-13881-2_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13880-5
Online ISBN: 978-3-642-13881-2
eBook Packages: Computer Science, Computer Science (R0)