Abstract
Many RDF knowledge bases (KBs) are created and enlarged by mining and extracting web data. As a result, their data sources are limited to social tagging networks such as Wikipedia, WordNet, and IMDB, and their precision is not guaranteed. In this paper, we propose a new system, ITEM, for extracting entities from tabular data and integrating them into an RDF knowledge base. ITEM efficiently computes the schema mapping between a table and a KB and injects novel entities into the KB. ITEM can therefore enlarge and improve an RDF KB by employing tabular data, which is assumed to be of high quality. ITEM detects the schema mapping between a table and an RDF KB using only the table's tuples, rather than the table's schema information. Experimental results show that our system has high precision and good performance.
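To make the idea of a tuple-based schema mapping concrete, the following is a minimal sketch and not ITEM's actual algorithm, which the abstract does not specify. It assumes the first column of a table holds the entity name, represents the KB as a list of (subject, predicate, object) triples, and uses a simple voting heuristic. The function names `map_columns_to_predicates` and `novel_entities`, the predicate names, and the sample data are all hypothetical.

```python
from collections import Counter

def map_columns_to_predicates(rows, triples, key_col=0):
    """Guess one KB predicate per non-key column using cell values only.

    Assumption: column `key_col` holds the entity name. No column headers
    or other schema information are consulted.
    """
    # Index the KB: (subject, object) -> set of predicates linking them.
    index = {}
    for s, p, o in triples:
        index.setdefault((s, o), set()).add(p)

    votes = {}  # column index -> Counter of predicate votes
    for row in rows:
        subject = row[key_col]
        for col, value in enumerate(row):
            if col == key_col:
                continue
            # Each KB triple that agrees with this cell is one vote.
            for pred in index.get((subject, value), ()):
                votes.setdefault(col, Counter())[pred] += 1

    # Keep the most frequently supported predicate for each column.
    return {col: c.most_common(1)[0][0] for col, c in votes.items()}

def novel_entities(rows, triples, key_col=0):
    """Return key-column values that never occur as a subject in the KB."""
    known = {s for s, _, _ in triples}
    return [row[key_col] for row in rows if row[key_col] not in known]

# Hypothetical KB and table, for illustration only.
triples = [
    ("Inception", "directedBy", "Christopher Nolan"),
    ("Inception", "releasedIn", "2010"),
    ("Titanic", "directedBy", "James Cameron"),
    ("Titanic", "releasedIn", "1997"),
]
rows = [
    ("Inception", "Christopher Nolan", "2010"),
    ("Titanic", "James Cameron", "1997"),
    ("Avatar", "James Cameron", "2009"),
]

mapping = map_columns_to_predicates(rows, triples)
new_names = novel_entities(rows, triples)
print(mapping)
print(new_names)
```

In this sketch, rows whose entity already exists in the KB supply the evidence for the mapping, and the learned mapping could then be used to turn the remaining rows into new triples. A real system would also need to handle approximate value matching and conflicting votes, which this sketch ignores.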
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, X., Chen, Y., Chen, J., Du, X. (2011). ITEM: Extract and Integrate Entities from Tabular Data to RDF Knowledge Base. In: Du, X., Fan, W., Wang, J., Peng, Z., Sharaf, M.A. (eds) Web Technologies and Applications. APWeb 2011. Lecture Notes in Computer Science, vol 6612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20291-9_45
DOI: https://doi.org/10.1007/978-3-642-20291-9_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20290-2
Online ISBN: 978-3-642-20291-9
eBook Packages: Computer Science, Computer Science (R0)