
Neural Computing and Applications, Volume 31, Issue 12, pp 9113–9126

An input information enhanced model for relation extraction

  • Ming Lei
  • Heyan Huang
  • Chong Feng
  • Yang Gao
  • Chao Su
Original Article

Abstract

We present a novel end-to-end model that jointly extracts semantic relations and argument entities from sentence text. The model requires no handcrafted feature set or auxiliary toolkit, so it can be easily extended to a wide range of sequence tagging tasks. We study a new method of using word morphology features for relation extraction, combining the word morphology feature with the semantic feature to enrich the representational capacity of the input vectors. We then develop an input information enhanced unit for the bidirectional long short-term memory network (Bi-LSTM) to overcome the information loss caused by the gate operations and the concatenation operations in the LSTM memory unit. A new tagging scheme using uncertain labels, together with a corresponding objective function, is exploited to reduce interference from non-entity words. Experiments are performed on three datasets: the New York Times (NYT) and ACE2005 datasets for relation extraction, and the SemEval 2010 task 8 dataset for relation classification. The results demonstrate that our model achieves a significant improvement over the state-of-the-art model for relation extraction on the NYT dataset and competitive performance on the ACE2005 dataset.
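The paper itself includes no code; as a minimal sketch of the enriched input representation described above (not the authors' actual architecture), a semantic word embedding can be concatenated with a morphology feature derived from character embeddings. The vocabulary, dimensions, random embeddings, and the max-pooling stand-in for a character-level feature extractor are all illustrative assumptions:

```python
import random

# Illustrative dimensions; the paper's actual sizes are not given here.
WORD_DIM, CHAR_DIM = 8, 4

random.seed(0)
vocab = ["obama", "visited", "paris"]
# Semantic word embeddings (random stand-ins for pre-trained vectors).
word_emb = {w: [random.gauss(0, 1) for _ in range(WORD_DIM)] for w in vocab}

def char_vec(ch):
    # Deterministic pseudo-embedding for a character, seeded by its code point.
    rnd = random.Random(ord(ch))
    return [rnd.gauss(0, 1) for _ in range(CHAR_DIM)]

def morphology_feature(word):
    # Element-wise max over the word's character embeddings: a crude
    # stand-in for a learned character-level feature extractor.
    vecs = [char_vec(ch) for ch in word]
    return [max(col) for col in zip(*vecs)]

def input_vector(word):
    # Concatenate the semantic feature with the morphology feature,
    # yielding one enriched input vector per token for a Bi-LSTM.
    return word_emb[word] + morphology_feature(word)

X = [input_vector(w) for w in vocab]  # 3 tokens, each WORD_DIM + CHAR_DIM wide
```

In a real model the morphology feature would come from a trained character-level network and the word embeddings from a pre-trained lookup table; the concatenation step shown here is the part the abstract describes.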

Keywords

Relation extraction · Information extraction · Natural language processing · Deep learning

Notes

Acknowledgements

The authors would like to thank Xiang Ren, Zeqiu Wu, Wenqi He, and their collaborators for the public NYT dataset they constructed. The authors are also grateful to Mikolov et al. for their public program for training word embeddings. This research work is supported by the National Key Research and Development Program of China (Grant No. 2017YFB0803302), the National Natural Science Foundation of China (No. 61751201), and the National Key Research and Development Program of China (No. 2016QY03D0602).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

References

  1. Zhang X, Zhao J, LeCun Y (2015) Character-level convolutional networks for text classification. In: Proceedings of the 28th international conference on neural information processing systems - volume 1, NIPS’15, pp 649–657
  2. Chiu J, Nichols E (2016) Named entity recognition with bidirectional LSTM-CNNs. Trans Assoc Comput Linguist 4:357
  3. Cao K, Rei M (2016) A joint model for word embedding and word morphology. In: Proceedings of the 1st workshop on representation learning for NLP, Association for Computational Linguistics, pp 18–26. https://doi.org/10.18653/v1/W16-1603. http://www.aclweb.org/anthology/W16-1603
  4. Ma X, Hovy E (2016) End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: long papers), Association for Computational Linguistics, pp 1064–1074. https://doi.org/10.18653/v1/P16-1101. http://www.aclweb.org/anthology/P16-1101
  5. Miwa M, Bansal M (2016) End-to-end relation extraction using LSTMs on sequences and tree structures. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: long papers), Association for Computational Linguistics, pp 1105–1116. https://doi.org/10.18653/v1/P16-1105. http://www.aclweb.org/anthology/P16-1105
  6. Zheng S, Wang F, Bao H, Hao Y, Zhou P, Xu B (2017) Joint extraction of entities and relations based on a novel tagging scheme. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: long papers), Association for Computational Linguistics, pp 1227–1236. https://doi.org/10.18653/v1/P17-1113. http://www.aclweb.org/anthology/P17-1113
  7. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735. https://doi.org/10.1162/neco.1997.9.8.1735
  8. Graves A, Jaitly N, Mohamed AR (2013) Hybrid speech recognition with deep bidirectional LSTM. In: IEEE workshop on automatic speech recognition and understanding (ASRU), pp 273–278
  9. Hearst MA (1992) Automatic acquisition of hyponyms from large text corpora. In: COLING 1992 volume 2: the 15th international conference on computational linguistics. http://www.aclweb.org/anthology/C92-2082
  10. Brin S (1999) Extracting patterns and relations from the World Wide Web. In: Selected papers from the international workshop on the World Wide Web and databases, WebDB ’98. Springer, London, pp 172–183. http://dl.acm.org/citation.cfm?id=646543.696220
  11. Agichtein E, Gravano L (2000) Snowball: extracting relations from large plain-text collections. In: Proceedings of the fifth ACM conference on digital libraries, DL ’00. ACM, New York, pp 85–94. https://doi.org/10.1145/336597.336644
  12. Blum A, Lafferty J, Rwebangira MR, Reddy R (2004) Semi-supervised learning using randomized mincuts. In: Proceedings of the twenty-first international conference on machine learning, ICML ’04. ACM, New York, p 13. https://doi.org/10.1145/1015330.1015429
  13. Oakes MP (2005) Using Hearst’s rules for the automatic acquisition of hyponyms for mining a pharmaceutical corpus. In: International workshop on text mining research, practice and opportunities, Borovets, Bulgaria, 24 September 2005, held in conjunction with RANLP, pp 63–67
  14. Chen J, Ji D, Tan CL, Niu Z (2006) Relation extraction using label propagation based semi-supervised learning. In: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the association for computational linguistics, Association for Computational Linguistics, pp 129–136. http://www.aclweb.org/anthology/P06-1017
  15. Bunescu R, Mooney R (2007) Learning to extract relations from the Web using minimal supervision. In: Proceedings of the 45th annual meeting of the association of computational linguistics, Association for Computational Linguistics, pp 576–583. http://www.aclweb.org/anthology/P07-1073
  16. Bollegala DT, Matsuo Y, Ishizuka M (2010) Relational duality: unsupervised extraction of semantic relations between entities on the Web. In: Proceedings of the 19th international conference on World Wide Web, WWW ’10. ACM, New York, pp 151–160. https://doi.org/10.1145/1772690.1772707
  17. Nakashole N, Tylenda T, Weikum G (2013) Fine-grained semantic typing of emerging entities. In: Proceedings of the 51st annual meeting of the association for computational linguistics (volume 1: long papers), Association for Computational Linguistics, pp 1488–1497. http://www.aclweb.org/anthology/P13-1146
  18. Zelenko D, Aone C, Richardella A (2003) Kernel methods for relation extraction. J Mach Learn Res 3:1083
  19. Bunescu RC, Mooney RJ (2005) Subsequence kernels for relation extraction. In: Proceedings of the 18th international conference on neural information processing systems, NIPS’05. MIT Press, Cambridge, pp 171–178. http://dl.acm.org/citation.cfm?id=2976248.2976270
  20. Qian L, Zhou G, Kong F, Zhu Q, Qian P (2008) Exploiting constituent dependencies for tree kernel-based semantic relation extraction. In: Proceedings of the 22nd international conference on computational linguistics (Coling 2008), Coling 2008 Organizing Committee, pp 697–704. http://www.aclweb.org/anthology/C08-1088
  21. Xu K, Feng Y, Huang S, Zhao D (2015) Semantic relation classification via convolutional neural networks with simple negative sampling. In: Proceedings of the 2015 conference on empirical methods in natural language processing, Association for Computational Linguistics, pp 536–540. https://doi.org/10.18653/v1/D15-1062. http://www.aclweb.org/anthology/D15-1062
  22. Zhang H, Sun Y, Zhao M, Chow TWS, Wu QMJ (2019) Understanding subtitles by character-level sequence-to-sequence learning. IEEE Trans Cybern. https://doi.org/10.1109/TCYB.2019.2900159
  23. dos Santos C, Guimarães V (2015) Boosting named entity recognition with neural character embeddings. In: Proceedings of the fifth named entity workshop, Association for Computational Linguistics, pp 25–33. https://doi.org/10.18653/v1/W15-3904. http://www.aclweb.org/anthology/W15-3904
  24. Zhang H, Li J, Ji Y, Yue H (2017) Understanding subtitles by character-level sequence-to-sequence learning. IEEE Trans Ind Inform 13(2):616. https://doi.org/10.1109/TII.2016.2601521
  25. LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541. https://doi.org/10.1162/neco.1989.1.4.541
  26. Xu Y, Mou L, Li G, Chen Y, Peng H, Jin Z (2015) Classifying relations via long short term memory networks along shortest dependency paths. In: Proceedings of the 2015 conference on empirical methods in natural language processing, Association for Computational Linguistics, pp 1785–1794. https://doi.org/10.18653/v1/D15-1206
  27. Xu Y, Jia R, Mou L, Li G, Chen Y, Lu Y, Jin Z (2016) Improved relation classification by deep recurrent neural networks with data augmentation. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers, The COLING 2016 Organizing Committee, pp 1461–1470. http://www.aclweb.org/anthology/C16-1138
  28. Zhang S, Zheng D, Hu X, Yang M (2015) Bidirectional long short-term memory networks for relation classification. In: Proceedings of the 29th Pacific Asia conference on language, information and computation, pp 73–78. http://www.aclweb.org/anthology/Y15-1009
  29. Zeng D, Liu K, Lai S, Zhou G, Zhao J (2014) Relation classification via convolutional deep neural network. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers, Dublin City University and Association for Computational Linguistics, pp 2335–2344. http://www.aclweb.org/anthology/C14-1220
  30. Wang L, Cao Z, de Melo G, Liu Z (2016) Relation classification via multi-level attention CNNs. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: long papers), Association for Computational Linguistics, pp 1298–1307. https://doi.org/10.18653/v1/P16-1123. http://www.aclweb.org/anthology/P16-1123
  31. dos Santos C, Xiang B, Zhou B (2015) Classifying relations by ranking with convolutional neural networks. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 1: long papers), Association for Computational Linguistics, pp 626–634. https://doi.org/10.3115/v1/P15-1061. http://www.aclweb.org/anthology/P15-1061
  32. Vu NT, Adel H, Gupta P, Schütze H (2016) Combining recurrent and convolutional neural networks for relation classification. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, Association for Computational Linguistics, pp 534–539. https://doi.org/10.18653/v1/N16-1065. http://www.aclweb.org/anthology/N16-1065
  33. Yang B, Cardie C (2013) Joint inference for fine-grained opinion extraction. In: Proceedings of the 51st annual meeting of the association for computational linguistics (volume 1: long papers), Association for Computational Linguistics, pp 1640–1649. http://www.aclweb.org/anthology/P13-1161
  34. Singh S, Riedel S, Martin B, Zheng J, McCallum A (2013) Joint inference of entities, relations, and coreference. In: Proceedings of the 2013 workshop on automated knowledge base construction, AKBC ’13. ACM, New York, pp 1–6. https://doi.org/10.1145/2509558.2509559
  35. Miwa M, Sasaki Y (2014) Modeling joint entity and relation extraction with table representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), Association for Computational Linguistics, pp 1858–1869. https://doi.org/10.3115/v1/D14-1200. http://www.aclweb.org/anthology/D14-1200
  36. Li Q, Ji H (2014) Incremental joint extraction of entity mentions and relations. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 1: long papers), Association for Computational Linguistics, pp 402–412. https://doi.org/10.3115/v1/P14-1038. http://www.aclweb.org/anthology/P14-1038
  37. Mintz M, Bills S, Snow R, Jurafsky D (2009) Distant supervision for relation extraction without labeled data. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP, Association for Computational Linguistics, pp 1003–1011. http://www.aclweb.org/anthology/P09-1113
  38. Riedel S, Yao L, McCallum A (2010) Modeling relations and their mentions without labeled text. In: European conference on machine learning and knowledge discovery in databases, pp 148–163
  39. Hoffmann R, Zhang C, Ling X, Zettlemoyer L, Weld DS (2011) Knowledge-based weak supervision for information extraction of overlapping relations. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, Association for Computational Linguistics, pp 541–550. http://www.aclweb.org/anthology/P11-1055
  40. Ren X, Wu Z, He W, Qu M, Voss CR, Ji H, Abdelzaher TF, Han J (2017) CoType: joint extraction of typed entities and relations with knowledge bases. In: Proceedings of the 26th international conference on World Wide Web, WWW ’17. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, pp 1015–1024. https://doi.org/10.1145/3038912.3052708
  41. Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th international conference on neural information processing systems - volume 2, NIPS’13. Curran Associates Inc., USA, pp 3111–3119. http://dl.acm.org/citation.cfm?id=2999792.2999959
  42. Peng N, Poon H, Quirk C, Toutanova K, Yih W (2017) Cross-sentence N-ary relation extraction with graph LSTMs. Trans Assoc Comput Linguist 5:101
  43. Gormley MR, Yu M, Dredze M (2015) Improved relation extraction with feature-rich compositional embedding models. In: Proceedings of the 2015 conference on empirical methods in natural language processing, Association for Computational Linguistics, pp 1774–1784. https://doi.org/10.18653/v1/D15-1205. http://www.aclweb.org/anthology/D15-1205
  44. Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) LINE: large-scale information network embedding. In: Proceedings of the 24th international conference on World Wide Web, WWW ’15. https://www.microsoft.com/en-us/research/publication/line-large-scale-information-network-embedding/
  45. Socher R, Huval B, Manning CD, Ng AY (2012) Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, Association for Computational Linguistics, pp 1201–1211. http://www.aclweb.org/anthology/D12-1110
  46. Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. In: International conference on learning representations
  47. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  • Ming Lei (1)
  • Heyan Huang (1)
  • Chong Feng (1), corresponding author
  • Yang Gao (1)
  • Chao Su (1)
  1. Beijing Institute of Technology, Beijing, China
