Abstract
Schema and ontology matching have attracted a great deal of interest among researchers. Despite the advances achieved, matching large schemas and ontologies remains a real challenge because it is a time-consuming and memory-intensive process. We therefore propose a scalable, clustering-based matching approach that breaks up the large matching problem into smaller matching problems. In particular, we first introduce a structure-based clustering approach to partition each schema graph into a set of disjoint subgraphs (clusters). Then, we propose a new measure that efficiently determines similar clusters between every two sets of clusters to obtain a set of small matching tasks. Finally, we adopt the matching prototype COMA++ to solve the individual matching tasks and combine their results. The experimental analysis reveals that the proposed method yields encouraging and significant improvements.
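The three steps described in the abstract (partition each schema, select similar cluster pairs, match only those pairs and merge the results) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the schemas are toy trees stored as child lists, `partition` simply cuts one cluster per subtree under the root rather than using the paper's structure-based clustering, `cluster_similarity` is a plain Jaccard overlap of element names rather than the paper's new measure, and `match_task` is a case-insensitive name comparison standing in for COMA++. All function names and the threshold value are hypothetical.

```python
from itertools import product


def partition(tree, root):
    """Split a schema tree into disjoint clusters, one per subtree under the root.

    Placeholder for structure-based clustering (assumption, not the paper's method).
    """
    clusters = []
    for child in tree.get(root, []):
        members, stack = [], [child]
        while stack:
            node = stack.pop()
            members.append(node)
            stack.extend(tree.get(node, []))
        clusters.append(frozenset(members))
    return clusters


def cluster_similarity(a, b):
    """Jaccard overlap of lower-cased element names (placeholder measure)."""
    names_a = {x.lower() for x in a}
    names_b = {x.lower() for x in b}
    return len(names_a & names_b) / len(names_a | names_b)


def select_tasks(clusters_s, clusters_t, threshold):
    """Keep only cluster pairs similar enough to be worth matching."""
    return [
        (a, b)
        for a, b in product(clusters_s, clusters_t)
        if cluster_similarity(a, b) >= threshold
    ]


def match_task(a, b):
    """Placeholder matcher: pair elements whose names are equal ignoring case."""
    return {(x, y) for x in a for y in b if x.lower() == y.lower()}


def match_schemas(tree_s, root_s, tree_t, root_t, threshold=0.3):
    """Run the small matching tasks and combine their results."""
    tasks = select_tasks(partition(tree_s, root_s), partition(tree_t, root_t), threshold)
    correspondences = set()
    for a, b in tasks:
        correspondences |= match_task(a, b)
    return correspondences, len(tasks)


# Toy schemas (hypothetical): parent element -> list of child elements.
source = {
    "Order": ["Customer", "Item"],
    "Customer": ["name", "address"],
    "Item": ["sku", "price"],
}
target = {
    "PO": ["Buyer", "Product"],
    "Buyer": ["Name", "Address", "phone"],
    "Product": ["SKU", "Price"],
}

matches, task_count = match_schemas(source, "Order", target, "PO")
print(task_count)       # number of cluster pairs actually matched
print(sorted(matches))  # combined correspondences
```

In this toy run only two of the four possible cluster pairs pass the threshold, so the matcher is invoked on two small tasks instead of on the full cross product of elements. That reduction in comparisons is the effect the divide-and-conquer design aims for.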
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Algergawy, A., Massmann, S., Rahm, E. (2011). A Clustering-Based Approach for Large-Scale Ontology Matching. In: Eder, J., Bielikova, M., Tjoa, A.M. (eds) Advances in Databases and Information Systems. ADBIS 2011. Lecture Notes in Computer Science, vol 6909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23737-9_30
DOI: https://doi.org/10.1007/978-3-642-23737-9_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23736-2
Online ISBN: 978-3-642-23737-9
eBook Packages: Computer Science, Computer Science (R0)