Abstract
Open Information Extraction (OIE) systems such as ReVerb, OLLIE, ClausIE, OpenIE 4.2, Stanford OIE, and PredPatt have attracted much attention for English OIE. However, few studies have been reported on OIE for languages other than English. This paper presents PLCOIE, a Chinese OIE system that extracts binary relation triples and N-ary relation tuples from Chinese documents. Our goal is to learn general patterns, composed of both dependency parsing roles and parts of speech, from a large corpus; the learned patterns are then used to extract relation tuples from documents. In addition, this paper alleviates the trans-classed word issue and the light verb construction issue. Experiments on four real-world data sets show that the results of PLCOIE are more precise than those of state-of-the-art Chinese OIE systems, which indicates that PLCOIE is feasible and effective.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, Y., Miao, Q., Guo, T., Geng, J., Hu, C., Xu, F. (2019). Pattern Learning for Chinese Open Information Extraction. In: Zhao, J., Harmelen, F., Tang, J., Han, X., Wang, Q., Li, X. (eds) Knowledge Graph and Semantic Computing. Knowledge Computing and Language Understanding. CCKS 2018. Communications in Computer and Information Science, vol 957. Springer, Singapore. https://doi.org/10.1007/978-981-13-3146-6_7
DOI: https://doi.org/10.1007/978-981-13-3146-6_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3145-9
Online ISBN: 978-981-13-3146-6
eBook Packages: Computer Science, Computer Science (R0)