A Clustering-Based Approach for Large-Scale Ontology Matching

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6909)

Abstract

Schema and ontology matching have attracted a great deal of interest among researchers. Despite the advances achieved, matching large schemas and ontologies remains a real challenge because it is a time-consuming and memory-intensive process. We therefore propose a scalable, clustering-based matching approach that breaks a large matching problem up into smaller matching problems. In particular, we first introduce a structure-based clustering approach that partitions each schema graph into a set of disjoint subgraphs (clusters). Then, we propose a new measure that efficiently determines similar clusters between the two sets of clusters, yielding a set of small matching tasks. Finally, we adopt the matching prototype COMA++ to solve the individual matching tasks and combine their results. The experimental analysis shows that the proposed method yields encouraging and significant improvements.
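The three steps named in the abstract (partition, pair up similar clusters, match within pairs) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract specifies neither the clustering algorithm nor the cluster similarity measure, so the sketch cuts each schema tree below its root, compares clusters by Jaccard overlap of name tokens, and uses a trivial name matcher as a stand-in for COMA++. The function names, thresholds, and toy schemas are hypothetical.

```python
import re

def name_tokens(name):
    """Split an element name such as 'customerName' into lowercase tokens."""
    return {t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)}

def jaccard(a, b):
    """Jaccard overlap of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def partition(tree, root):
    """Toy structural clustering (assumption): each subtree directly
    below the root becomes one disjoint cluster of element names."""
    def collect(node):
        nodes = [node]
        for child in tree.get(node, []):
            nodes.extend(collect(child))
        return nodes
    return [collect(child) for child in tree.get(root, [])]

def cluster_similarity(c1, c2):
    """Illustrative cluster measure (assumption, not the paper's measure):
    Jaccard overlap of all name tokens occurring in each cluster."""
    bag1 = set().union(*(name_tokens(n) for n in c1))
    bag2 = set().union(*(name_tokens(n) for n in c2))
    return jaccard(bag1, bag2)

def match_elements(c1, c2, threshold):
    """Placeholder element matcher standing in for COMA++."""
    return [(a, b) for a in c1 for b in c2
            if jaccard(name_tokens(a), name_tokens(b)) >= threshold]

def match_schemas(tree1, root1, tree2, root2,
                  cluster_threshold=0.3, element_threshold=0.3):
    """Match only cluster pairs that look similar, then combine the results."""
    result = []
    for c1 in partition(tree1, root1):
        for c2 in partition(tree2, root2):
            if cluster_similarity(c1, c2) >= cluster_threshold:
                result.extend(match_elements(c1, c2, element_threshold))
    return result

# Hypothetical toy schemas, given as parent -> children adjacency lists.
source = {"Order": ["Customer", "Item"],
          "Customer": ["customerName", "customerAddress"],
          "Item": ["itemPrice", "itemQuantity"]}
target = {"Purchase": ["Buyer", "Product"],
          "Buyer": ["buyerName", "buyerAddress"],
          "Product": ["productPrice", "quantity"]}

matches = match_schemas(source, "Order", target, "Purchase")
```

In this toy example only two of the four cluster pairs pass the cluster threshold, so the element matcher compares 18 name pairs instead of the 36 it would compare across all non-root elements. That reduction in pairwise comparisons is the effect the divide-and-match strategy aims for.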

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Algergawy, A., Massmann, S., Rahm, E. (2011). A Clustering-Based Approach for Large-Scale Ontology Matching. In: Eder, J., Bielikova, M., Tjoa, A.M. (eds) Advances in Databases and Information Systems. ADBIS 2011. Lecture Notes in Computer Science, vol 6909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23737-9_30

  • DOI: https://doi.org/10.1007/978-3-642-23737-9_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23736-2

  • Online ISBN: 978-3-642-23737-9

  • eBook Packages: Computer Science, Computer Science (R0)
