LinksB2N: Automatic Data Integration for the Semantic Web

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2009 (OTM 2009)

Abstract

The ongoing trend towards open data embraced by the Semantic Web has started to produce a large number of data sources. These data sources are published using RDF vocabularies, and their graph topology makes it possible to navigate through the data. This paper presents LinksB2N, an algorithm that discovers information overlaps in RDF data repositories and, with no human intervention, integrates data sets that partially share the same domain.

LinksB2N identifies equivalent RDF resources from different data sets with several degrees of confidence. The algorithm relies on a novel approach that uses clustering techniques to analyze the distribution of unique objects that contain overlapping information in different data graphs. Our contribution is illustrated in the context of the Market Blended Insight project by applying the LinksB2N algorithm to data sets on the order of hundreds of millions of RDF triples containing relevant information in the domain of business-to-business (B2B) marketing analysis.
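
The idea of scoring equivalences by how uniquely an overlapping value identifies a resource can be illustrated with a small example. This is a minimal sketch under stated assumptions, not the LinksB2N implementation: the abstract does not specify the clustering step, so the sketch substitutes a plain ratio of unique shared values per predicate pair as the confidence. The function names, the tuple representation of triples, and the sample data are all hypothetical.

```python
from collections import defaultdict


def index_by_predicate(triples):
    """Map predicate -> object value -> set of subjects carrying that value."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        index[predicate][obj].add(subject)
    return index


def candidate_links(triples_a, triples_b):
    """Return {(subject_a, subject_b): confidence} for overlapping values.

    Assumption: confidence is the share of overlapping values, for a given
    predicate pair, that identify exactly one subject in each data set.
    This stands in for the clustering analysis described in the abstract.
    """
    index_a = index_by_predicate(triples_a)
    index_b = index_by_predicate(triples_b)
    links = {}
    for values_a in index_a.values():
        for values_b in index_b.values():
            shared = values_a.keys() & values_b.keys()
            if not shared:
                continue
            # A value is "unique" if it points to one subject on each side.
            unique = [
                v for v in shared
                if len(values_a[v]) == 1 and len(values_b[v]) == 1
            ]
            confidence = len(unique) / len(shared)
            for value in unique:
                (subject_a,) = values_a[value]
                (subject_b,) = values_b[value]
                pair = (subject_a, subject_b)
                # Keep the best score seen across predicate pairs.
                links[pair] = max(links.get(pair, 0.0), confidence)
    return links


# Hypothetical sample data: two tiny data sets with different vocabularies.
triples_a = [
    ("a:1", "a:name", "Acme Ltd"),
    ("a:1", "a:postcode", "SO17 1BJ"),
    ("a:2", "a:name", "Bolt plc"),
    ("a:2", "a:postcode", "SO17 1BJ"),
]
triples_b = [
    ("b:x", "b:label", "Acme Ltd"),
    ("b:x", "b:zip", "SO17 1BJ"),
    ("b:y", "b:label", "Bolt plc"),
    ("b:y", "b:zip", "AB1 2CD"),
]

links = candidate_links(triples_a, triples_b)
```

In this sample, the name values are unique on both sides and so produce candidate links, while the shared postcode is carried by two subjects in the first data set and therefore contributes none. A real data set would likely need fuzzy matching of values and a more robust scoring step than this ratio.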

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Salvadores, M., Correndo, G., Rodriguez-Castro, B., Gibbins, N., Darlington, J., Shadbolt, N.R. (2009). LinksB2N: Automatic Data Integration for the Semantic Web. In: Meersman, R., Dillon, T., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2009. OTM 2009. Lecture Notes in Computer Science, vol 5871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05151-7_27

  • DOI: https://doi.org/10.1007/978-3-642-05151-7_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05150-0

  • Online ISBN: 978-3-642-05151-7

  • eBook Packages: Computer Science, Computer Science (R0)
