Abstract
Schema matching plays a central role in many applications that require interoperability among heterogeneous data sources. A good evaluation for different capabilities of schema matching systems has become vital as the complexity of such systems arises. The capabilities of matching systems incorporate different (possibly conflicting) aspects among them match quality and match efficiency. The analysis of efficiency of a schema matching system, if it is done, tends to be done in a way separate from the analysis of effectiveness. In this paper, we present the trade-off between schema matching effectiveness and efficiency as a multi-objective optimization problem. This representation enables us to obtain a combined measure as a compromise between them. We combine both performance aspects in a weighted-average function to determine the cost-effectiveness of a schema matching system. We apply our proposed approach to evaluate two currently existing mainstream schema matching systems namely COMA++ and BTreeMatch. Experimental results showed that, by carefully utilizing both small-scale and large-scale schemas, it is necessary to take the response time of the matching process into account especially in large-scale schemas.
Keywords
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Bernstein, P.A., Melnik, S., Churchill, J.E.: Incremental schema matching. In: VLDB, Korea (2006)
Do, H.H., Melnik, S., Rahm, E.: Comparison of schema matching evaluations. In: the 2nd Int. Workshop on Web Databases (2002)
Do, H.H., Rahm, E.: COMA- a system for flexible combination of schema matching approaches. In: VLDB, pp. 610–621 (2002)
Do, H.-H., Rahm, E.: Matching large schemas: Approaches and evaluation. Information Systems 32(6), 857–885 (2007)
Doan, A., Domingos, P., Halevy, A.: Reconciling schemas of disparate data sources: A machine-learning approach. SIGMOD, 509–520 (2001)
Doan, A., Halevy, A.: Semantic integration research in the database community: A brief survey. AAAI AI Magazine 25(1), 83–94 (2005)
Drumm, C., Schmitt, M., Do, H.-H., Rahm, E.: Quickmig - automatic schema matching for data migration projects. In: Proc. ACM CIKM 2007, Portugal (2007)
Duchateau, F., Bellahsene, Z., Hunt, E.: Xbenchmatch: a benchmark for XML schema matching tools. In: VLDB 2007, Austria, pp. 1318–1321 (2007)
Duchateau, F., Bellahsene, Z., Roche, M.: An indexing structure for automatic schema matching. In: SMDB Workshop, Turkey (2007)
Madhavan, J., Bernstein, P.A., Rahm, E.: Generic schema matching with cupid. In: VLDB, Italy, pp. 49–58 (2001)
Marler, R., arora, J.: Survey of multi-objective optimization methods for engineering. Struct. Multidisc Optim. 26, 369–395 (2004)
Melnik, S., Garcia-Molina, H., Rahm, E.: Similarity flooding: A versatile graph matching algorithm and its application to schema matching. In: ICDE 2002 (2002)
Rahm, E., Bernstein, P.A.: A survey of approaches to automatic schema matching. VLDB Journal 10(4), 334–350 (2001)
Rijsbergen, C.J.: Information Retrieval, 2nd edn., London (1979)
Smiljanic, M.: XML Schema Matching Balancing Efficiency and Effectiveness by means of Clustering. PhD thesis, Twente University (2006)
Yatskevich, M.: Prelimanary evaluation of schema matching systems. Technical Report #DIT-03-028, Tornoto University (2003)
Zhang, Z., Che, H., Shi, P., Sun, Y., Gu, J.: Formulation schema matching problem for combinatorial optimization problem. IBIS 1(1), 33–60 (2006)
Zitzler, E., Thiele, L.: Multiobjective evolutionaty algorithms: A comparative case study and the strength pareto approach. IEEE Tran. on EC 3, 257–271 (1999)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Algergawy, A., Schallehn, E., Saake, G. (2008). Combining Effectiveness and Efficiency for Schema Matching Evaluation. In: Kutsche, RD., Milanovic, N. (eds) Model-Based Software and Data Integration. MBSDI 2008. Communications in Computer and Information Science, vol 8. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78999-4_4
Download citation
DOI: https://doi.org/10.1007/978-3-540-78999-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78998-7
Online ISBN: 978-3-540-78999-4
eBook Packages: Computer ScienceComputer Science (R0)