Detect Missing Attributes for Entities in Knowledge Bases via Hierarchical Clustering

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2014)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 496)

Abstract

Automatically constructed knowledge bases often suffer from quality issues such as missing attributes for existing entities. Manually finding and filling in missing attributes is time-consuming and expensive, since knowledge bases are growing at an unforeseen speed. We therefore propose an automatic approach that suggests missing attributes for entities via hierarchical clustering, based on the intuition that similar entities may share a similar group of attributes. We evaluate our method on a randomly sampled set of 20,000 entities from DBPedia. The experimental results show that our method achieves high precision and outperforms existing methods.
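
The intuition stated in the abstract (similar entities may share a similar group of attributes) can be illustrated with a small toy sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, linkage criterion, or thresholds, so the Jaccard similarity, the average-linkage merging, the `threshold` and `min_support` parameters, and the example entities below are all assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two attribute sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_entities(entities, threshold=0.5):
    """Bottom-up (agglomerative) clustering with average linkage.

    entities: dict mapping entity name -> set of attribute names.
    Merging stops once no pair of clusters reaches `threshold`.
    """
    clusters = [[name] for name in sorted(entities)]
    while len(clusters) > 1:
        best_score, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            sims = [jaccard(entities[x], entities[y])
                    for x in clusters[i] for y in clusters[j]]
            score = sum(sims) / len(sims)
            if score > best_score:
                best_score, best_pair = score, (i, j)
        if best_score < threshold:
            break
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def suggest_missing(entities, clusters, min_support=0.6):
    """Suggest attributes held by at least `min_support` of a cluster
    but absent from a given member of that cluster."""
    suggestions = {}
    for cluster in clusters:
        if len(cluster) < 2:
            continue  # a singleton cluster gives no evidence
        counts = {}
        for name in cluster:
            for attr in entities[name]:
                counts[attr] = counts.get(attr, 0) + 1
        common = {a for a, c in counts.items()
                  if c / len(cluster) >= min_support}
        for name in cluster:
            missing = common - entities[name]
            if missing:
                suggestions[name] = sorted(missing)
    return suggestions


# Hypothetical toy data, not taken from the paper or from DBPedia.
entities = {
    "Berlin": {"country", "population", "mayor", "area"},
    "Paris": {"country", "population", "mayor", "area"},
    "Madrid": {"country", "population", "area"},
    "Einstein": {"birthDate", "field", "almaMater"},
    "Curie": {"birthDate", "field", "almaMater", "spouse"},
    "Bohr": {"birthDate", "field"},
}

clusters = cluster_entities(entities)
suggestions = suggest_missing(entities, clusters)
print(suggestions)
```

On this toy data the cities and the scientists end up in separate clusters, and each suggestion is an attribute that most other members of the same cluster already have. The pairwise loop is quadratic per merge step, so a sample of the size mentioned in the abstract would need a more efficient clustering routine.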

This work was supported by the National High Technology R&D Program of China (Grant No. 2012AA011101, 2014AA015102), National Natural Science Foundation of China (Grant No. 61272344, 61202233, 61370055) and the joint project with IBM Research. Corresponding author: Yansong Feng.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Luo, B., Lu, H., Diao, Y., Feng, Y., Zhao, D. (2014). Detect Missing Attributes for Entities in Knowledge Bases via Hierarchical Clustering. In: Zong, C., Nie, JY., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2014. Communications in Computer and Information Science, vol 496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45924-9_35

  • DOI: https://doi.org/10.1007/978-3-662-45924-9_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45923-2

  • Online ISBN: 978-3-662-45924-9

  • eBook Packages: Computer Science, Computer Science (R0)
