Abstract
Ontology instances are typically stored as triples, each of which associates two named entities with a pre-defined relation. Such triples can be incomplete: one entity is known but the other is missing. The automatic discovery of the missing values is closely related to relation extraction, in which systems extract binary relations between two identified entities. Relation extraction systems rely on accurately labelled named entities, since mislabelled entities can decrease the number of relations correctly identified. Although recent results demonstrate over 80% accuracy for recognising named entities, performance decreases rapidly when input texts have less consistent patterns. This paper presents OntotripleQA, which applies question-answering techniques to relation extraction in order to reduce the reliance on named entities and to take other assessments into account when evaluating potential relations. Not only does this increase the number of relations extracted, it also improves extraction accuracy by considering features that cannot be obtained by comparing named entities alone. A small dataset was collected to test the proposed approach, and the experiment demonstrates that it is effective on sentences from Web documents, with an average accuracy of 68%.
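To make the notion of an incomplete triple concrete, the following is a minimal sketch of how a triple with a missing entity might be represented and turned into a natural-language question. It is an illustration only and not the OntotripleQA implementation: the `Triple` class, the relation names, the `QUESTION_TEMPLATES` table and the example entity are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triple:
    """A (subject, relation, object) triple; object is None when it is missing."""
    subject: str
    relation: str
    obj: Optional[str] = None

    def is_incomplete(self) -> bool:
        # In this sketch only the object entity may be missing.
        return self.obj is None


# Hypothetical question templates keyed by relation name (assumed, not from the paper).
QUESTION_TEMPLATES = {
    "place_of_birth": "Where was {entity} born?",
    "date_of_birth": "When was {entity} born?",
}


def to_question(triple: Triple) -> str:
    """Turn an incomplete triple into a question about its missing entity."""
    if not triple.is_incomplete():
        raise ValueError("triple is already complete")
    template = QUESTION_TEMPLATES.get(triple.relation)
    if template is None:
        raise KeyError(f"no question template for relation {triple.relation!r}")
    return template.format(entity=triple.subject)


if __name__ == "__main__":
    print(to_question(Triple("Rembrandt", "place_of_birth")))
```

In such a setup, candidate answers to the generated question would then be scored to fill in the missing entity. That scoring step is not shown here because the abstract does not describe it in detail.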
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, S., Lewis, P., Martinez, K., Goodall, S. (2004). Question Answering Towards Automatic Augmentations of Ontology Instances. In: Bussler, C.J., Davies, J., Fensel, D., Studer, R. (eds) The Semantic Web: Research and Applications. ESWS 2004. Lecture Notes in Computer Science, vol 3053. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25956-5_11
DOI: https://doi.org/10.1007/978-3-540-25956-5_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21999-6
Online ISBN: 978-3-540-25956-5
eBook Packages: Springer Book Archive