From Open Information Extraction to Semantic Web: A Context Rule-Based Strategy

  • Julio Hernandez
  • Ivan Lopez-Arevalo
  • Jose L. Martinez-Rodriguez
  • Edwyn Aldana-Bobadilla
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11308)

Abstract

The Web is a valuable source of information, most of which is presented as unstructured text. Extracting structured information from such sources is an important challenge for the Semantic Web and Information Extraction areas: elements representing real-world objects (named entities) and the relations between them must be extracted from text and formally represented as RDF triples. Doing this manually is infeasible because of the Web's scale and the heterogeneity of its domains. Open Information Extraction (OIE) addresses this as a domain-independent task that uses patterns to extract any kind of relation between named entities. A natural next step is to transform such relations into RDF triples. This paper proposes a method for representing the relations obtained by an OIE approach as RDF triples. The method extracts named entities, their relation, and contextual information from an input sentence, and applies a set of defined rules to map the extracted elements to resources from a Semantic Web Knowledge Base. The evaluation shows promising results for both the extraction and the representation of information.
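
To make the target representation concrete, the following is a minimal illustrative sketch of turning one OIE-style (subject, relation, object) tuple into an RDF triple in N-Triples syntax. It is not the paper's method: the context rules are not reproduced here, the `ENTITY_URIS` lookup is a hypothetical stand-in for a named entity linking step, and the `http://example.org/` namespace and the function names are assumptions made for illustration.

```python
from urllib.parse import quote

# Hypothetical lookup standing in for named entity linking against a
# Knowledge Base; the entries are illustrative only.
ENTITY_URIS = {
    "Tim Berners-Lee": "http://dbpedia.org/resource/Tim_Berners-Lee",
    "World Wide Web": "http://dbpedia.org/resource/World_Wide_Web",
}

# Assumed fallback namespace for relations and unlinked subjects.
EX = "http://example.org/"


def to_uri(text, base):
    """Build a URI by joining whitespace-separated tokens with underscores."""
    slug = "_".join(text.strip().split())
    return base + quote(slug)


def oie_to_ntriple(subject, relation, obj):
    """Serialize one extracted tuple as a single N-Triples line."""
    s = ENTITY_URIS.get(subject, to_uri(subject, EX + "resource/"))
    p = to_uri(relation, EX + "property/")
    linked = ENTITY_URIS.get(obj)
    if linked is None:
        # Unlinked objects are kept as plain string literals.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        o_term = '"' + escaped + '"'
    else:
        o_term = "<" + linked + ">"
    return "<" + s + "> <" + p + "> " + o_term + " ."


print(oie_to_ntriple("Tim Berners-Lee", "invented", "World Wide Web"))
```

In this sketch an object that cannot be linked falls back to a string literal, which is one simple design choice among several; a rule-based approach could instead use contextual information to decide how each element is mapped.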

Keywords

  • Open Information Extraction
  • Semantic Web
  • Named entity recognition
  • Named entity linking

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Julio Hernandez (1)
  • Ivan Lopez-Arevalo (1)
  • Jose L. Martinez-Rodriguez (1)
  • Edwyn Aldana-Bobadilla (2)
  1. Cinvestav Tamaulipas, Ciudad Victoria, Mexico
  2. Conacyt-Cinvestav, Mexico City, Mexico
