An Improved Discriminative Category Matching in Relation Identification

  • Yongliang Sun
  • Jing Yang
  • Xin Lin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7934)


This paper describes an improved method for relation identification, the last step of unsupervised relation extraction. Similar entity pairs may be grouped into the same cluster. It is also important to select a keyword that describes the relation accurately. Therefore, an improved DF feature selection method is employed to rearrange the features of low-frequency entity pairs and obtain a feature set for each cluster. An improved Discriminative Category Matching (DCM) method is then used to select typical and discriminative words for the entity pairs’ relation. Experimental results show that the improved DCM method performs better than the original DCM method in relation identification.
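
The abstract does not spell out the improved DF method, so the following is only a minimal sketch of plain frequency-based feature selection for one cluster, assuming DF means document frequency (the number of entity pairs whose context contains a word). The function names, the `min_df` threshold, and the toy cluster are hypothetical illustrations, not the authors' implementation.

```python
from collections import Counter


def document_frequency(cluster):
    """Count, for each word, how many entity pairs in the cluster contain it."""
    df = Counter()
    for context_words in cluster:
        # Use a set so a word is counted at most once per entity pair.
        df.update(set(context_words))
    return df


def select_features(cluster, min_df=2):
    """Keep words that occur in at least `min_df` entity pairs (baseline DF filter)."""
    df = document_frequency(cluster)
    return sorted(word for word, count in df.items() if count >= min_df)


# Hypothetical toy cluster: each inner list holds the context words of one entity pair.
cluster = [
    ["acquire", "buy", "company"],
    ["acquire", "purchase", "company"],
    ["acquire", "merge"],
]

print(select_features(cluster, min_df=2))
```

A plain threshold like this discards words that appear with only one entity pair, which is presumably why low-frequency entity pairs need the special handling the paper proposes.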


Keywords: Unsupervised Relation Extraction, Improved DF, Low-frequency entity pair, Improved DCM





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yongliang Sun (1)
  • Jing Yang (1)
  • Xin Lin (1)
  1. Department of Computer Science and Technology, East China Normal University, China
