Extracting Frame-Like Structures from Google Books NGram Dataset

Ivanov, Vladimir

doi:10.1007/978-3-319-13647-9_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8856)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

We propose a method that facilitates semi-automatic FrameNet construction. The method requires the Google Books NGram dataset and WordNet or another thesaurus for the target language. We evaluated the method on Russian ngrams. Because of the huge amount of available data, the method does not require sophisticated natural language processing techniques (e.g., word sense disambiguation), and it shows promising results.

Author information

Authors and Affiliations

Kazan Federal University, 420008, Kazan, Kremlevskaya st., 18, Russia
Vladimir Ivanov
National University of Science and Technology "MISIS", 119049, Moscow, Leninskiy pr., 4, Russia
Vladimir Ivanov
Institute of Informatics, Tatarstan Academy of Sciences, Levoboulachnaya St., 36a, Kazan, Russia
Vladimir Ivanov

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan Dios Bátiz s/n, Col. Nueva Industrial Vallejo, 07738, Mexico City, Mexico
Alexander Gelbukh
Área Académica de Computación y Electrónica, Carretera Pachuca-Tulancingo, Universidad Autónoma del Estado de Hidalgo, Km. 4.5, Col. Carboneras, Mineral de la Reforma, 42180, Hidalgo, Mexico
Félix Castro Espinoza
Facultad de ciencias, Universidad Autónoma Nacional de México, Ciudad Universitaria, México DF, Mexico
Sofía N. Galicia-Haro

About this paper

Cite this paper

Ivanov, V. (2014). Extracting Frame-Like Structures from Google Books NGram Dataset. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_3

DOI: https://doi.org/10.1007/978-3-319-13647-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science, Computer Science (R0)
