A Matching Algorithm for Electronic Data Interchange

  • Rami Rifaieh
  • Uddam Chukmol
  • Nabila Benharkat
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3811)

Abstract

One of the problems in today's electronic commerce lies in data heterogeneity (i.e., differences in format and vocabulary). This representation incompatibility, particularly in EDI (Electronic Data Interchange), is handled manually by a human expert who consults the usage guideline of each message to be translated. This manual work is tedious, error-prone, and expensive. The goal of this work is to partially automate the discovery of semantic correspondences between EDI messages of different standards, using XML Schema as the pivot format. The semi-automatic schema matching algorithm, EX-SMAL, takes two schemata of EDI messages as input and computes the basic similarity between each pair of elements by comparing their textual descriptions and data types. It then computes a structural similarity value based on the structural neighbors of each element (ancestors, siblings, immediate children, and leaf elements) using an aggregation function. The basic and structural similarity values are combined into the pairwise element similarity, which is the final similarity value between two elements. The paper also discusses implementation issues and a test scenario for EX-SMAL with messages from the EDIFACT and SWIFT standards.
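
The three-stage computation described in the abstract (basic similarity, structural similarity over the four kinds of neighbors, then a combined pairwise value) can be illustrated with a minimal sketch. It is not the EX-SMAL implementation: the abstract does not give the formulas, so the token-overlap text measure, the exact-match type comparison, the best-match averaging over neighbor groups, the equal-weight aggregation, and all weights are assumptions, and the element descriptions are made up for illustration.

```python
class Element:
    """A schema element with a textual description, a data type, and tree links."""
    def __init__(self, description, data_type, parent=None):
        self.description = description
        self.data_type = data_type
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def ancestors(e):
    out = []
    while e.parent is not None:
        e = e.parent
        out.append(e)
    return out

def siblings(e):
    return [] if e.parent is None else [s for s in e.parent.children if s is not e]

def leaves(e):
    """Leaf elements strictly below e."""
    out, stack = [], list(e.children)
    while stack:
        n = stack.pop()
        if n.children:
            stack.extend(n.children)
        else:
            out.append(n)
    return out

def text_sim(a, b):
    # Assumption: Jaccard overlap of lowercase word sets stands in for
    # whatever textual-description comparison the paper uses.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def basic_sim(x, y, w_text=0.7):
    # Assumption: data types are compared by exact equality; w_text is arbitrary.
    type_sim = 1.0 if x.data_type == y.data_type else 0.0
    return w_text * text_sim(x.description, y.description) + (1 - w_text) * type_sim

def group_sim(xs, ys):
    # Assumption: two empty neighbor groups count as fully similar,
    # and an empty group against a non-empty one as fully dissimilar.
    if not xs and not ys:
        return 1.0
    if not xs or not ys:
        return 0.0
    # Average, over xs, of the best basic similarity found in ys.
    return sum(max(basic_sim(x, y) for y in ys) for x in xs) / len(xs)

def structural_sim(x, y):
    # Assumption: the aggregation function is an unweighted mean
    # over the four neighbor kinds named in the abstract.
    parts = [
        group_sim(ancestors(x), ancestors(y)),
        group_sim(siblings(x), siblings(y)),
        group_sim(x.children, y.children),
        group_sim(leaves(x), leaves(y)),
    ]
    return sum(parts) / len(parts)

def pair_sim(x, y, w_basic=0.5):
    # Final pairwise similarity: weighted mix of basic and structural values.
    return w_basic * basic_sim(x, y) + (1 - w_basic) * structural_sim(x, y)

# Two tiny hypothetical schemata (descriptions are illustrative only).
root_a = Element("payment order message", "complex")
a_amount = Element("amount of the payment", "decimal", root_a)
a_account = Element("beneficiary account number", "string", root_a)

root_b = Element("payment instruction message", "complex")
b_amount = Element("payment amount", "decimal", root_b)
b_account = Element("account number of beneficiary", "string", root_b)

for target in (b_amount, b_account):
    print(target.description, round(pair_sim(a_amount, target), 3))
```

In this sketch the two "amount" elements score higher than the mismatched pair because both their own descriptions and their siblings agree, which is the intuition behind adding structural neighbors to a purely textual comparison.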

Keywords

Match Algorithm; Structural Neighbor; Textual Description; Schema Match; Basic Similarity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Rami Rifaieh (1)
  • Uddam Chukmol (2)
  • Nabila Benharkat (3)
  1. San Diego Supercomputer Center, University of California San Diego, La Jolla, USA
  2. Computer Science Department, Cambodia Technological Institute, Phnom Penh, Cambodia
  3. LIRIS, National Institute of Applied Science of Lyon, Villeurbanne, France