Detect Missing Attributes for Entities in Knowledge Bases via Hierarchical Clustering

Luo, Bingfeng; Lu, Huanquan; Diao, Yigang; Feng, Yansong; Zhao, Dongyan

doi:10.1007/978-3-662-45924-9_35

Part of the book series: Communications in Computer and Information Science (CCIS, volume 496)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Automatically constructed knowledge bases often suffer from quality issues such as missing attributes for existing entities. Manually finding and filling in missing attributes is time-consuming and expensive, since knowledge bases are growing at an unforeseen speed. We therefore propose an automatic approach that suggests missing attributes for entities via hierarchical clustering, based on the intuition that similar entities may share a similar group of attributes. We evaluate our method on a randomly sampled set of 20,000 entities from DBpedia. The experimental results show that our method achieves high precision and outperforms existing methods.
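
The intuition stated in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given on this page. The sketch assumes Jaccard distance over attribute sets, average-linkage agglomerative clustering, a distance cutoff (`max_distance`), and a majority-style support threshold (`min_support`); all function names, parameters, entity names, and attribute names below are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def agglomerative_clusters(entities, max_distance):
    """Average-linkage agglomerative clustering over attribute sets.

    entities: dict mapping entity name -> set of attribute names.
    Merging stops once the closest pair of clusters is farther apart
    than max_distance (an assumed cutoff, not taken from the paper).
    """
    clusters = [[name] for name in entities]

    def linkage(c1, c2):
        total = sum(jaccard_distance(entities[x], entities[y])
                    for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) > max_distance:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def suggest_missing(entities, clusters, min_support=0.6):
    """Suggest attributes held by at least min_support of a cluster's
    members to the members that lack them."""
    suggestions = {}
    for cluster in clusters:
        if len(cluster) < 2:
            continue  # a singleton cluster carries no evidence
        counts = {}
        for name in cluster:
            for attr in entities[name]:
                counts[attr] = counts.get(attr, 0) + 1
        common = {a for a, n in counts.items()
                  if n / len(cluster) >= min_support}
        for name in cluster:
            missing = common - entities[name]
            if missing:
                suggestions[name] = sorted(missing)
    return suggestions


# Toy data: hypothetical entities and attribute names.
entities = {
    "Film_A": {"director", "starring", "runtime", "releaseDate"},
    "Film_B": {"director", "starring", "runtime"},
    "Film_C": {"director", "starring", "runtime", "releaseDate"},
    "City_X": {"population", "country", "areaTotal"},
    "City_Y": {"population", "country"},
}

clusters = agglomerative_clusters(entities, max_distance=0.5)
suggestions = suggest_missing(entities, clusters, min_support=0.6)
print(suggestions)
```

In this toy data the three films end up in one cluster and the two cities in another. `releaseDate` is held by two of the three films, so it is suggested for `Film_B`, while `areaTotal` is held by only one of the two cities and falls below the assumed support threshold.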

This work was supported by the National High Technology R&D Program of China (Grant No. 2012AA011101, 2014AA015102), National Natural Science Foundation of China (Grant No. 61272344, 61202233, 61370055) and the joint project with IBM Research. Corresponding author: Yansong Feng.

Author information

Authors and Affiliations

Department of Machine Intelligence, Peking University, Peking, China
Bingfeng Luo & Huanquan Lu
Technic and Communication Technology Bureau, Xinhua News Agency, China
Yigang Diao
ICST, Peking University, Peking, China
Yansong Feng & Dongyan Zhao

Editor information

Editors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 100190, Beijing, China
Chengqing Zong
Dept. of Computer Science and Operations Research, University of Montreal, Montreal, Quebec, Canada
Jian-Yun Nie
Peking University, Beijing, China
Dongyan Zhao
Institute of Computer Science & Technology, Peking University, 100871, Beijing, China
Yansong Feng

About this paper

Cite this paper

Luo, B., Lu, H., Diao, Y., Feng, Y., Zhao, D. (2014). Detect Missing Attributes for Entities in Knowledge Bases via Hierarchical Clustering. In: Zong, C., Nie, JY., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2014. Communications in Computer and Information Science, vol 496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45924-9_35

DOI: https://doi.org/10.1007/978-3-662-45924-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45923-2
Online ISBN: 978-3-662-45924-9
eBook Packages: Computer Science, Computer Science (R0)
