A Clustering-Based Approach for Large-Scale Ontology Matching

Algergawy, Alsayed; Massmann, Sabine; Rahm, Erhard

doi:10.1007/978-3-642-23737-9_30

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6909)

Abstract

Schema and ontology matching have attracted a great deal of interest among researchers. Despite the advances achieved, matching large schemas and ontologies remains a real challenge because it is a time-consuming and memory-intensive process. We therefore propose a scalable, clustering-based matching approach that breaks up a large matching problem into smaller matching problems. In particular, we first introduce a structure-based clustering approach that partitions each schema graph into a set of disjoint subgraphs (clusters). Then, we propose a new measure that efficiently identifies similar clusters between the two sets of clusters, yielding a set of small matching tasks. Finally, we adopt the matching prototype COMA++ to solve the individual matching tasks and combine their results. The experimental analysis reveals that the proposed method permits encouraging and significant improvements.
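
The divide-and-conquer pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The clustering step is skipped and the clusters are assumed to be given as lists of element names. The cluster measure here is a simple Jaccard overlap of name tokens, chosen only for illustration, since the abstract does not define the paper's measure. The function `match_task` is a hypothetical stand-in for COMA++, and all function names and thresholds are assumptions.

```python
from itertools import product


def tokens(name):
    """Split an element name such as 'order_id' into lowercase tokens."""
    return set(name.lower().replace("-", "_").split("_"))


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(c1, c2):
    """Illustrative cluster measure (an assumption, not the paper's):
    Jaccard overlap of all name tokens occurring in each cluster."""
    t1 = set().union(*(tokens(n) for n in c1))
    t2 = set().union(*(tokens(n) for n in c2))
    return jaccard(t1, t2)


def similar_cluster_pairs(clusters1, clusters2, threshold=0.3):
    """Keep only cluster pairs similar enough to be worth matching."""
    return [
        (i, j)
        for (i, c1), (j, c2) in product(enumerate(clusters1), enumerate(clusters2))
        if cluster_similarity(c1, c2) >= threshold
    ]


def match_task(c1, c2, threshold=0.5):
    """Hypothetical stand-in for an element matcher such as COMA++:
    pairs two elements when their name tokens overlap enough."""
    return {
        (a, b)
        for a, b in product(c1, c2)
        if jaccard(tokens(a), tokens(b)) >= threshold
    }


def match_partitioned(clusters1, clusters2):
    """Solve each small matching task and combine the results."""
    result = set()
    for i, j in similar_cluster_pairs(clusters1, clusters2):
        result |= match_task(clusters1[i], clusters2[j])
    return result


# Two toy schemas, each already partitioned into two clusters.
schema1 = [["customer_name", "customer_address"], ["order_id", "order_date"]]
schema2 = [["customer_full_name", "customer_address"], ["order_id", "ship_date"]]

pairs = similar_cluster_pairs(schema1, schema2)
correspondences = match_partitioned(schema1, schema2)
```

In this toy input only two of the four possible cluster pairs pass the similarity filter, so the element matcher runs on two small tasks instead of on the full cross product of both schemas. This reduction is what the partitioning is meant to achieve.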

Author information

Authors and Affiliations

Department of Computer Science, University of Leipzig, Germany
Alsayed Algergawy, Sabine Massmann & Erhard Rahm

Editor information

Editors and Affiliations

Alpen Adria Universität Klagenfurt, Institut für Informatik-Systeme, Universitätsstr. 65, 9020, Klagenfurt, Austria
Johann Eder
Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, Ilkovičova 3, 842 16, Bratislava, Slovakia
Maria Bielikova
Institut für Softwaretechnik, Technische Universität Wien, Favoritenstr. 9-11/188, 1040, Wien, Austria
A Min Tjoa

About this paper

Cite this paper

Algergawy, A., Massmann, S., Rahm, E. (2011). A Clustering-Based Approach for Large-Scale Ontology Matching. In: Eder, J., Bielikova, M., Tjoa, A.M. (eds) Advances in Databases and Information Systems. ADBIS 2011. Lecture Notes in Computer Science, vol 6909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23737-9_30

DOI: https://doi.org/10.1007/978-3-642-23737-9_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23736-2
Online ISBN: 978-3-642-23737-9
eBook Packages: Computer Science, Computer Science (R0)
