Heuristic Algorithm for Extraction of Facts Using Relational Model and Syntactic Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7094)

Abstract

From a semantic point of view, information is usually contained in small units called facts, which are typically smaller than sentences. Identifying these facts in a text is not a trivial task. We present a heuristic algorithm for extracting facts from sentences using a simple representation based on a relational data model. We focus our study on texts that by their nature contain many facts: structured textbooks. The algorithm is based on data obtained from a syntactic analyzer. The extracted facts can be useful for information retrieval, automatic summarization, and similar tasks. Our experiments are conducted for the Spanish language. We obtained better results than similar methods.
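
The abstract does not spell out the heuristics themselves, so the following is only an illustrative sketch of the general idea of storing a fact as a row in a relation built from syntactic data. It is not the authors' algorithm. The `Dep` arc format, the `subj`/`obj` labels, the `Fact` columns, and the single rule in `extract_facts` are all assumptions made for the example, and the arcs are written by hand rather than produced by a real parser.

```python
from collections import defaultdict
from typing import NamedTuple


class Dep(NamedTuple):
    """One dependency arc: head word, arc label, dependent word (hypothetical format)."""
    head: str
    label: str
    dependent: str


class Fact(NamedTuple):
    """One row of a hypothetical 'facts' relation."""
    subject: str
    predicate: str
    obj: str


def extract_facts(arcs: list[Dep]) -> list[Fact]:
    """Emit one Fact for every verb that has both a subject arc and an object arc."""
    subjects = defaultdict(list)
    objects = defaultdict(list)
    for arc in arcs:
        if arc.label == "subj":
            subjects[arc.head].append(arc.dependent)
        elif arc.label == "obj":
            objects[arc.head].append(arc.dependent)
    facts = []
    for verb, subs in subjects.items():
        for s in subs:
            for o in objects.get(verb, []):
                facts.append(Fact(s, verb, o))
    return facts


# Sentence: "Colón descubrió América en 1492" (hand-written arcs, assumed labels).
arcs = [
    Dep("descubrió", "subj", "Colón"),
    Dep("descubrió", "obj", "América"),
    Dep("descubrió", "mod", "1492"),
]
facts = extract_facts(arcs)
```

Here a single sentence yields one row, which is smaller than the sentence itself: the temporal modifier is ignored by this toy rule, whereas a fuller set of heuristics could presumably map it to a row of its own.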



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sidorov, G., Herrera-de-la-Cruz, J.A., Galicia-Haro, S.N., Posadas-Durán, J.P., Chanona-Hernandez, L. (2011). Heuristic Algorithm for Extraction of Facts Using Relational Model and Syntactic Data. In: Batyrshin, I., Sidorov, G. (eds) Advances in Artificial Intelligence. MICAI 2011. Lecture Notes in Computer Science, vol 7094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25324-9_28

  • DOI: https://doi.org/10.1007/978-3-642-25324-9_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25323-2

  • Online ISBN: 978-3-642-25324-9

  • eBook Packages: Computer Science, Computer Science (R0)
