Performance Oriented Schema Matching

  • Conference paper
Database and Expert Systems Applications (DEXA 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4653)

Abstract

Semantic matching of schemas in heterogeneous data sharing systems is time-consuming and error-prone. Existing mapping tools employ semi-automatic techniques for mapping two schemas at a time. In a large-scale scenario, where data sharing involves a large number of data sources, such techniques are not suitable. We present a new robust mapping method which creates a mediated schema tree from a large set of input XML schema trees and defines mappings from the contributing schemas to the mediated schema. The result is an almost automatic technique giving good performance with approximate semantic match quality. Our method uses node ranks calculated by pre-order traversal. It combines tree mining with semantic label clustering, which minimizes the target search space and improves performance, thus making the algorithm suitable for large-scale data sharing. We report on experiments with up to 80 schemas containing 83,770 nodes, with our prototype implementation taking 587 seconds to match and merge them to create a mediated schema and to return mappings from input schemas to the mediated schema.
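The abstract names two ingredients, node ranks from a pre-order traversal and clustering of node labels, without giving details. The following is a minimal illustrative sketch of both ideas, not the authors' implementation. The `Node` class, the `last` field used for a constant-time ancestor test, and the crude label normalization (lowercasing and stripping non-alphanumeric characters) are all assumptions introduced here; the paper's semantic clustering is presumably more sophisticated.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical schema tree node (not taken from the paper)."""
    label: str
    children: list["Node"] = field(default_factory=list)
    rank: int = -1  # pre-order number, filled in by assign_ranks
    last: int = -1  # largest rank inside this node's subtree


def assign_ranks(root: Node) -> int:
    """Number nodes in pre-order starting at 0; return the node count."""
    counter = 0

    def visit(node: Node) -> None:
        nonlocal counter
        node.rank = counter
        counter += 1
        for child in node.children:
            visit(child)
        # Every rank handed out during this call belongs to the subtree.
        node.last = counter - 1

    visit(root)
    return counter


def is_ancestor(a: Node, d: Node) -> bool:
    """True if a is a proper ancestor of d (both from the same ranked tree)."""
    return a.rank < d.rank <= a.last


def cluster_labels(roots: list[Node]) -> dict[str, list[tuple[int, int]]]:
    """Group nodes from all trees by a normalized label.

    Assumption: plain string normalization stands in for semantic
    similarity. Each cluster holds (tree index, node rank) pairs.
    """
    clusters: dict[str, list[tuple[int, int]]] = defaultdict(list)
    for tree_index, root in enumerate(roots):
        stack = [root]
        while stack:
            node = stack.pop()
            key = "".join(ch for ch in node.label.lower() if ch.isalnum())
            clusters[key].append((tree_index, node.rank))
            stack.extend(node.children)
    return dict(clusters)
```

With ranks assigned, a candidate match for a node only needs to be sought among the members of its label cluster, and structural checks between candidates reduce to integer comparisons. This is one plausible reading of how clustering could shrink the target search space.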

Editor information

Roland Wagner, Norman Revell, Günther Pernul

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saleem, K., Bellahsene, Z., Hunt, E. (2007). Performance Oriented Schema Matching. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_82

  • DOI: https://doi.org/10.1007/978-3-540-74469-6_82

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74467-2

  • Online ISBN: 978-3-540-74469-6

  • eBook Packages: Computer Science, Computer Science (R0)
