RCE-OIE: Open Information Extraction Using a Rule-Based Clause Extraction Engine for Semantic Applications

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 709)

Abstract

Open Information Extraction (OIE) is the process of extracting the clauses present in a text. Clause extraction is useful for several applications; however, existing OIE methods do not focus on improving such applications. In this paper, we present RCE-OIE, a methodology for OIE that uses a rule-based clause extraction engine and considers aspects such as the handling of coordinating conjunctions, negations, and relative clauses in order to improve semantic applications. We evaluated RCE-OIE on OIE datasets to show that our clause extraction approach is domain-independent and comparable with state-of-the-art OIE systems. RCE-OIE is also capable of improving the performance of downstream applications. In particular, it significantly improves the performance of paraphrase identification on the Microsoft Research corpus when compared with existing OIE systems.
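
To make the idea of rule-based handling of coordinating conjunctions and negations concrete, the following is a minimal toy sketch. It is not the RCE-OIE engine described in the paper: it assumes the subject, verb phrase, and object have already been identified (a real system would obtain them from a parser), it does not treat relative clauses, and it naively distributes every coordinated subject over every coordinated object. The names `Clause`, `split_conjuncts`, and `extract_clauses` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

# Illustrative word lists only; a real rule set would be far richer.
COORDINATORS = {"and", "or"}
NEGATORS = {"not", "never", "n't"}


@dataclass(frozen=True)
class Clause:
    """A simple (subject, relation, argument) clause with a negation flag."""
    subject: str
    relation: str
    argument: str
    negated: bool = False


def split_conjuncts(phrase: str) -> List[str]:
    """Split a whitespace-tokenized phrase on commas and coordinators."""
    parts: List[str] = []
    current: List[str] = []
    for token in phrase.split():
        if token == "," or token.lower() in COORDINATORS:
            if current:
                parts.append(" ".join(current))
                current = []
        else:
            current.append(token)
    if current:
        parts.append(" ".join(current))
    return parts


def extract_clauses(subject: str, verb_phrase: str, obj: str) -> List[Clause]:
    """Emit one clause per (subject conjunct, object conjunct) pair.

    Negation words are removed from the relation and recorded as a flag,
    so that "never visited" and "visited" share the same relation string.
    """
    tokens = verb_phrase.split()
    negated = any(t.lower() in NEGATORS for t in tokens)
    relation = " ".join(t for t in tokens if t.lower() not in NEGATORS)
    return [
        Clause(s, relation, o, negated)
        for s in split_conjuncts(subject)
        for o in split_conjuncts(obj)
    ]


if __name__ == "__main__":
    for clause in extract_clauses("Alice and Bob", "never visited", "Paris , Rome"):
        print(clause)
```

Keeping negation as a separate flag, rather than leaving it inside the relation string, is one plausible design choice for downstream tasks such as paraphrase identification, where two sentences with the same relation but opposite polarity should not be treated as equivalent.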

Notes

  1. http://corenlp.run/

Acknowledgements

We would like to thank the management of SSN Institutions for funding the High Performance Computing (HPC) lab where this research is being carried out.

Author information

Corresponding author

Correspondence to D. Thenmozhi.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Thenmozhi, D., Aravindan, C. (2018). RCE-OIE: Open Information Extraction Using a Rule-Based Clause Extraction Engine for Semantic Applications. In: Sa, P., Bakshi, S., Hatzilygeroudis, I., Sahoo, M. (eds) Recent Findings in Intelligent Computing Techniques. Advances in Intelligent Systems and Computing, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-8633-5_20

  • DOI: https://doi.org/10.1007/978-981-10-8633-5_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8632-8

  • Online ISBN: 978-981-10-8633-5

  • eBook Packages: Engineering, Engineering (R0)
