Abstract
Ontologies play a pervasive role in many areas of IT, and a substantial number of them have been developed over the last decade. Nevertheless, finding the right ontology for a specific purpose remains difficult: a suitable ontology is often unavailable, or the available ones are inadequate. Although many ontology learning methods exist, there is no comprehensive model of the whole process of ontology learning from text. This article presents a metamodel of ontology learning from text. The approach is based on a survey of existing methods, and it is evaluated by means of a reference implementation of the introduced metamodel.
Notes
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7. Last updated: 4.04.2008.
- 8. Last updated: 4.04.2008.
- 9. Last updated: 4.04.2008.
- 10. http://lcl.di.uniroma1.it/people.jsp. Last accessed: 15.05.2008.
- 11.
- 12. Some authors prefer to separate instance extraction from the concept extraction task, following the historical division into terminological (TBox) and assertional (ABox) knowledge. Still, the methods used for these two tasks in ontology learning are similar.
- 13.
- 14.
- 15. We had the pleasure of welcoming a KMi researcher from the Open University as a guest at our university.
Copyright information
© 2010 Springer-Verlag London
About this chapter
Cite this chapter
Wisniewski, M. (2010). Metamodel of Ontology Learning from Text. In: Badr, Y., Chbeir, R., Abraham, A., Hassanien, AE. (eds) Emergent Web Intelligence: Advanced Semantic Technologies. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84996-077-9_10
DOI: https://doi.org/10.1007/978-1-84996-077-9_10
Publisher Name: Springer, London
Print ISBN: 978-1-84996-076-2
Online ISBN: 978-1-84996-077-9
eBook Packages: Computer Science, Computer Science (R0)