Heuristic Algorithm for Extraction of Facts Using Relational Model and Syntactic Data

Sidorov, Grigori; Herrera-de-la-Cruz, Juve Andrea; Galicia-Haro, Sofía N.; Posadas-Durán, Juan Pablo; Chanona-Hernandez, Liliana

doi:10.1007/978-3-642-25324-9_28

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7094)

Abstract

From a semantic point of view, information is usually contained in small units called facts, which are typically smaller than sentences. Identifying these facts in a text is not a trivial task. We present a heuristic algorithm for extracting facts from sentences using a simple representation based on the relational data model. We focus our study on texts that by their nature contain many facts: structured textbooks. The algorithm relies on data produced by a syntactic analyzer. The extracted facts can be useful for information retrieval, automatic summarization, and similar tasks. Our experiments are conducted for the Spanish language. We obtained better results than similar methods.
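
The abstract does not give the relation schema or the heuristics themselves, so the following is only a minimal sketch of the general idea, under stated assumptions: a syntactic analyzer has already produced dependency arcs, and each fact is stored as one row of a three-column relation. The `Fact` fields, the arc labels `subj` and `obj`, and the function `extract_facts` are all hypothetical names chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# One dependency arc as (head, label, dependent).
# The labels "subj" and "obj" are assumed here; a real parser uses its own tag set.
Arc = Tuple[str, str, str]


@dataclass(frozen=True)
class Fact:
    """One row of a hypothetical three-column fact relation."""
    subject: str
    predicate: str
    obj: str


def extract_facts(arcs: List[Arc]) -> List[Fact]:
    """Emit one fact per (subject, object) pair that shares the same head word."""
    subjects: Dict[str, List[str]] = {}
    objects: Dict[str, List[str]] = {}
    for head, label, dependent in arcs:
        if label == "subj":
            subjects.setdefault(head, []).append(dependent)
        elif label == "obj":
            objects.setdefault(head, []).append(dependent)

    facts: List[Fact] = []
    for head, subject_words in subjects.items():
        for subject in subject_words:
            # A head with a subject but no object yields no fact in this sketch.
            for obj in objects.get(head, []):
                facts.append(Fact(subject, head, obj))
    return facts


# Hand-written arcs for the Spanish sentence "Colón descubrió América".
example_arcs: List[Arc] = [
    ("descubrió", "subj", "Colón"),
    ("descubrió", "obj", "América"),
]
example_facts = extract_facts(example_arcs)
```

Because a single sentence can yield several rows, this kind of representation matches the observation that facts are usually smaller than sentences; the actual heuristics in the paper may differ substantially.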

Author information

Authors and Affiliations

Natural Language and Text Processing Laboratory, Center for Computing Research (CIC), National Polytechnic Institute (IPN), Av. Juan de Dios Bátiz, s/n, Zacatenco, 07738, Mexico City, Mexico
Grigori Sidorov, Juve Andrea Herrera-de-la-Cruz & Juan Pablo Posadas-Durán
Faculty of Sciences, Autonomous National University of Mexico (UNAM), Mexico City, Mexico
Sofía N. Galicia-Haro
Engineering Faculty (ESIME), National Polytechnic Institute (IPN), Av. Juan de Dios Bátiz, Zacatenco, 07738, Mexico City, Mexico
Liliana Chanona-Hernandez

Editor information

Editors and Affiliations

Mexican Petroleum Institute (IMP), Eje Central Lazaro Cardenas Norte, 152, Col. San Bartolo Atepehuacan, CP 07730, Mexico DF, Mexico
Ildar Batyrshin
National Polytechnic Institute (IPN), Center for Computing Research (CIC), Av. Juan de Dios Bátiz, s/n, Col. Nueva Industrial Vallejo, CP 07738, Mexico D.F., Mexico
Grigori Sidorov

About this paper

Cite this paper

Sidorov, G., Herrera-de-la-Cruz, J.A., Galicia-Haro, S.N., Posadas-Durán, J.P., Chanona-Hernandez, L. (2011). Heuristic Algorithm for Extraction of Facts Using Relational Model and Syntactic Data. In: Batyrshin, I., Sidorov, G. (eds) Advances in Artificial Intelligence. MICAI 2011. Lecture Notes in Computer Science, vol 7094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25324-9_28

DOI: https://doi.org/10.1007/978-3-642-25324-9_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25323-2
Online ISBN: 978-3-642-25324-9
eBook Packages: Computer Science, Computer Science (R0)
