Effective Chinese Organization Name Linking to a List-Like Knowledge Base

  • Chengyuan Xue
  • Haofen Wang (email author)
  • Bo Jin
  • Mengjie Wang
  • Daqi Gao
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 480)

Abstract

Entity linking is widely used in entity retrieval and semantic search. It maps mentions in unstructured documents to their corresponding entries in a knowledge base (KB). Frequently used KBs (e.g., Wikipedia) usually contain abundant information about each entity, such as properties, name variations, and text descriptions, which helps to find candidates and disambiguate links. In this paper, we link organization names in Chinese documents to a list-like KB. Unlike typical KBs, the records in our KB are simply the full names of Chinese organizations. The many name variations and abbreviations that appear in documents cannot be directly matched to any organization name in the KB and introduce ambiguity, which makes the linking task difficult. First, we enrich the KB with abbreviations: using information from Hudong Baike and other sources, we design a pattern-based full name annotation method that helps generate abbreviations for all names in the KB. To resolve ambiguity, we propose a two-stage link generation approach that exploits the co-occurrence of abbreviations and full names in the same document or document cluster, where the full names linked in the first stage constrain the linking of abbreviations in the second stage. We apply our approach to a corpus of police inquiry documents. The experimental results show that our approach is effective and significantly outperforms the one-stage approach in terms of precision and recall.
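
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `two_stage_link`, the abbreviation dictionary `abbrev_to_full`, the fallback rule for abbreviations with a single candidate, and the organization names in the usage example are all assumptions made for illustration. The sketch assumes abbreviations have already been generated for every full name in the KB.

```python
def two_stage_link(mentions, kb_full_names, abbrev_to_full):
    """Link mentions from one document (or document cluster) to KB full names.

    mentions: list of mention strings found in the document(s).
    kb_full_names: set of full organization names (the list-like KB).
    abbrev_to_full: dict mapping an abbreviation to the set of full names
        it may stand for (hypothetical; assumed to be built beforehand).
    Returns a dict mapping each mention to a full name, or None if unresolved.
    """
    links = {}
    linked_full = set()

    # Stage 1: link mentions that exactly match a full name in the KB.
    for m in mentions:
        if m in kb_full_names:
            links[m] = m
            linked_full.add(m)

    # Stage 2: link abbreviations, constrained by the stage-1 links.
    for m in mentions:
        if m in links:
            continue
        candidates = abbrev_to_full.get(m, set())
        constrained = candidates & linked_full
        if len(constrained) == 1:
            # Exactly one candidate co-occurs as a full name: use it.
            links[m] = next(iter(constrained))
        elif len(candidates) == 1:
            # Assumed fallback: the abbreviation is unambiguous in the KB.
            links[m] = next(iter(candidates))
        else:
            # Unknown or still ambiguous: leave unlinked.
            links[m] = None
    return links


# Usage with made-up organization names.
kb = {"上海华东贸易有限公司", "北京华东贸易有限公司"}
abbrevs = {"华东贸易": {"上海华东贸易有限公司", "北京华东贸易有限公司"}}
doc_mentions = ["上海华东贸易有限公司", "华东贸易"]
result = two_stage_link(doc_mentions, kb, abbrevs)
```

In this example the abbreviation alone is ambiguous between two KB entries, but the full name that co-occurs in the same document narrows it to one. Without that co-occurrence the sketch leaves the abbreviation unlinked rather than guessing.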

Keywords

Knowledge Base, Core Part, Name Entity Recognition, Document Cluster, Count Repeat
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This work is funded by The 3rd Research Institute of The Ministry of Public Security through project No. C13601. We thank Tong Ruan for guidance on the project and Chen Wang for her proofreading.

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Chengyuan Xue (1)
  • Haofen Wang (1, email author)
  • Bo Jin (2)
  • Mengjie Wang (1)
  • Daqi Gao (1)
  1. East China University of Science and Technology, Shanghai, China
  2. The 3rd Research Institute of the Public Security Ministry, Shanghai, China
