Personal Attributes Extraction in Chinese Text Based on Distant-Supervision and LSTM

  • Wenxi Yao
  • Jin Liu
  • Zehuan Cai
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 474)


In this paper, we propose a distant-supervision approach to address the shortage of training data for extracting attributes from unstructured text: Wikipedia infobox information is used to tag Wikipedia text automatically, which yields a training corpus. We treat attribute extraction as a sequence labeling problem and use Wikipedia text about persons as the training corpus. The CLP-2014 Task 4 data set is used as the test corpus. The experimental results show that this method improves the quality of attribute extraction.
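The tagging step described above can be illustrated with a minimal sketch. It assumes character-level BIO labels and exact string matching of infobox values against the sentence; the abstract does not state the labeling scheme or matching rules actually used, and the function name `tag_sentence` and the attribute names `name` and `birthplace` are hypothetical.

```python
from typing import Dict, List, Tuple


def tag_sentence(sentence: str, infobox: Dict[str, str]) -> List[Tuple[str, str]]:
    """Label each character with a BIO tag by matching infobox values.

    Assumption: exact substring matching and BIO tags; this is an
    illustration, not the labeling procedure from the paper.
    """
    tags = ["O"] * len(sentence)
    # Try longer values first so a short value cannot claim part of a longer one.
    for attr, value in sorted(infobox.items(), key=lambda kv: -len(kv[1])):
        if not value:
            continue  # an empty infobox field cannot be matched
        start = sentence.find(value)
        while start != -1:
            end = start + len(value)
            # Only tag spans that no other attribute has claimed yet.
            if all(t == "O" for t in tags[start:end]):
                tags[start] = f"B-{attr}"
                for i in range(start + 1, end):
                    tags[i] = f"I-{attr}"
            start = sentence.find(value, end)
    return list(zip(sentence, tags))


if __name__ == "__main__":
    # Hypothetical infobox entries for one person page.
    example = tag_sentence("鲁迅出生于绍兴", {"name": "鲁迅", "birthplace": "绍兴"})
    for char, tag in example:
        print(char, tag)
```

Character/tag pairs produced this way could serve as automatically labeled training sequences for a sequence labeling model such as an LSTM, without manual annotation. Exact matching is noisy: a value may appear in a sentence that does not actually express the attribute.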


Deep learning; Entity attribute extraction; LSTM; Sequence padding; NLP; Distant-supervised



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. College of Information Engineering, Shanghai Maritime University, Shanghai, China
