Lexically Evaluating Ontology Triples Generated Automatically from Texts

  • Peter Spyns
  • Marie-Laure Reinberger
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3532)


We present a method to lexically evaluate material extracted from text corpora in an unsupervised way for the purpose of building ontologies. We worked on a legal corpus (the EU VAT directive) of 43K words. The unsupervised text miner produced a set of triples, which are to be used as preprocessed material for constructing ontologies from scratch. A quantitative scoring method has been defined and applied, based on coverage, accuracy, recall and precision metrics, which yielded scores of 38.68%, 52.1%, 9.84% and 75.81% respectively.
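
The abstract does not define its metrics, so the following is only a minimal sketch of how a lexical precision and recall score could be computed for a set of triples. It assumes the standard set-based definitions from information retrieval and a hypothetical reference list of domain-relevant words. The function name, the triples and the reference words are all made up for illustration and are not taken from the paper. Coverage and accuracy are left out because their definitions cannot be recovered from the abstract.

```python
def precision_recall(extracted_words, reference_words):
    """Standard set-based precision and recall (an assumption, not
    necessarily the definitions used by the authors)."""
    extracted = set(extracted_words)
    reference = set(reference_words)
    hits = extracted & reference
    # Precision: share of extracted words that are in the reference list.
    precision = len(hits) / len(extracted) if extracted else 0.0
    # Recall: share of reference words that were extracted.
    recall = len(hits) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical triples (subject, relation, object), for illustration only.
triples = [
    ("taxable_person", "pay", "tax"),
    ("member_state", "apply", "exemption"),
]

# Collect the words in subject and object position of each triple.
triple_words = {word for subj, _, obj in triples for word in (subj, obj)}

# Hypothetical reference list of domain-relevant words.
reference_words = {"tax", "exemption", "supply", "member_state", "invoice"}

precision, recall = precision_recall(triple_words, reference_words)
print(f"precision={precision:.2%} recall={recall:.2%}")
```

On this toy data, three of the four extracted words appear among the five reference words. The pattern of high precision with low recall reported in the abstract would correspond to a miner that extracts mostly relevant words but misses many of the words in the reference list.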


Keywords (added by machine, not by the authors): Frequency Class; Characteristic Word; Grammatical Relation; Ontology Learn; Relevant Word



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Peter Spyns (1)
  • Marie-Laure Reinberger (2)
  1. STAR Lab, Vrije Universiteit Brussel, Brussel, Belgium
  2. CNTS, University of Antwerp, Wilrijk, Belgium
