ITEM: Extract and Integrate Entities from Tabular Data to RDF Knowledge Base

  • Conference paper
Web Technologies and Applications (APWeb 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6612)

Abstract

Many RDF knowledge bases (KBs) are created and enlarged by mining and extracting web data. As a result, their data sources are limited to social tagging networks such as Wikipedia, WordNet, and IMDB, and their precision is not guaranteed. In this paper, we propose a new system, ITEM, for extracting entities from tabular data and integrating them into an RDF knowledge base. ITEM efficiently computes the schema mapping between a table and a KB and injects novel entities into the KB. ITEM can therefore enlarge and improve an RDF KB using tabular data, which is assumed to be of high quality. ITEM detects the schema mapping between a table and an RDF KB using only the table's tuples, rather than its schema information. Experimental results show that our system has high precision and good performance.
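
The abstract does not spell out how the mapping is computed, so the following is only an illustrative sketch of tuple-based (instance-based) schema mapping in general, not ITEM's actual algorithm. It assumes a toy KB of (subject, predicate, object) triples and a headerless table, scores each column against each predicate by counting overlapping values, and picks the best assignment by brute force. All names (`KB`, `ROWS`, `best_mapping`, `novel_rows`) and the data are hypothetical.

```python
from itertools import permutations

# Hypothetical toy KB: (subject, predicate, object) triples.
KB = [
    ("e1", "name", "Alice"),
    ("e1", "city", "Paris"),
    ("e2", "name", "Bob"),
    ("e2", "city", "Rome"),
]

# A table with no header: only the tuples are available.
ROWS = [
    ("Paris", "Alice"),
    ("Rome", "Bob"),
    ("Oslo", "Carol"),
]


def predicate_values(kb):
    """Group the object values in the KB by predicate."""
    values = {}
    for _, pred, obj in kb:
        values.setdefault(pred, set()).add(obj)
    return values


def score(rows, col, known):
    """Count how many cells in column `col` appear among the known values."""
    return sum(1 for row in rows if row[col] in known)


def best_mapping(rows, kb):
    """Try every column-to-predicate assignment and keep the best-scoring one.

    Brute force is an assumption made for brevity; it only suits small tables
    and returns None when there are more columns than predicates.
    """
    values = predicate_values(kb)
    preds = sorted(values)
    n_cols = len(rows[0])
    best, best_score = None, -1
    for perm in permutations(preds, n_cols):
        total = sum(score(rows, c, values[p]) for c, p in enumerate(perm))
        if total > best_score:
            best, best_score = dict(enumerate(perm)), total
    return best


def novel_rows(rows, kb, mapping, key_pred):
    """Return rows whose value for `key_pred` is not yet present in the KB."""
    key_col = next(c for c, p in mapping.items() if p == key_pred)
    known = predicate_values(kb)[key_pred]
    return [row for row in rows if row[key_col] not in known]


mapping = best_mapping(ROWS, KB)
candidates = novel_rows(ROWS, KB, mapping, "name")
```

Here `mapping` pairs each column index with a predicate, and `candidates` holds the rows that could be injected as novel entities. For larger tables, an assignment algorithm would be a more realistic choice than enumerating permutations.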

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Guo, X., Chen, Y., Chen, J., Du, X. (2011). ITEM: Extract and Integrate Entities from Tabular Data to RDF Knowledge Base. In: Du, X., Fan, W., Wang, J., Peng, Z., Sharaf, M.A. (eds) Web Technologies and Applications. APWeb 2011. Lecture Notes in Computer Science, vol 6612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20291-9_45

  • DOI: https://doi.org/10.1007/978-3-642-20291-9_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20290-2

  • Online ISBN: 978-3-642-20291-9

  • eBook Packages: Computer Science, Computer Science (R0)
