Using versioned trees, change detection and node identity for three-way XML merging

  • Cheng Thao
  • Ethan V. Munson
Regular Paper


XML has become the standard document representation for many popular tools in various domains. When multiple authors collaborate to produce a document, they must be able to work in parallel and periodically merge their efforts into a single work. A small number of three-way XML merging tools exist, but their performance can be improved in several areas. We present a three-way XML merge algorithm that is faster, uses less memory, and is more precise than previous algorithms. It uses a specialized versioning tree data structure that supports node identity and change detection. The algorithm applies the traditional three-way merge found in GNU diff3 to the children of changed nodes. The editing operations it supports are addition, deletion, update, and move. The algorithm is evaluated by comparing its performance to that of previous algorithms, using synthetically generated XML documents of varying sizes that have been modified by varying numbers of random editing operations. The prototype merge tool used in these tests also includes a simple graphical interface for visualizing and resolving conflicts.
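To make the idea of applying a diff3-style merge to the children of a changed node more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each child is represented by a stable node identifier (a plain string here) and merges three child lists: the common base and two edited versions. The function names `merge3` and `_match_map` are hypothetical, and the alignment comes from Python's `difflib.SequenceMatcher` rather than the longest-common-subsequence computation used by GNU diff3, so results may differ from diff3 on some inputs.

```python
from difflib import SequenceMatcher


def _match_map(base, other):
    """Map each aligned base index to its index in `other`."""
    mapping = {}
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


def merge3(base, left, right):
    """Diff3-style merge of three lists of child node IDs (illustrative only).

    Returns (merged, conflicts). For a conflicting chunk the base content
    is kept in `merged` and the (base, left, right) chunks are recorded.
    """
    to_left = _match_map(base, left)
    to_right = _match_map(base, right)

    # Sync points: base children that are aligned in both edited versions,
    # followed by a sentinel marking the end of all three lists.
    sync = [(i, to_left[i], to_right[i])
            for i in range(len(base)) if i in to_left and i in to_right]
    sync.append((len(base), len(left), len(right)))

    merged, conflicts = [], []
    ib = il = ir = 0
    for sb, sl, sr in sync:
        b_chunk, l_chunk, r_chunk = base[ib:sb], left[il:sl], right[ir:sr]
        if l_chunk == r_chunk:        # both sides agree (or neither changed)
            merged.extend(l_chunk)
        elif l_chunk == b_chunk:      # only the right side changed
            merged.extend(r_chunk)
        elif r_chunk == b_chunk:      # only the left side changed
            merged.extend(l_chunk)
        else:                         # both changed differently: conflict
            conflicts.append((b_chunk, l_chunk, r_chunk))
            merged.extend(b_chunk)
        if sb < len(base):            # emit the stable child itself
            merged.append(base[sb])
        ib, il, ir = sb + 1, sl + 1, sr + 1
    return merged, conflicts


if __name__ == "__main__":
    # Left adds child "x"; right deletes child "c".
    print(merge3(["a", "b", "c"], ["a", "x", "b", "c"], ["a", "b"]))
```

In this sketch, non-overlapping edits from the two versions are combined automatically, while a region that both versions changed in different ways is reported as a conflict for the user to resolve. Update and move operations, which the algorithm described above also supports, are outside the scope of this list-level example.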


Three-way merge · Collaborative editing · Versioning system · Algorithm · XML · Data structures



This research has been supported by an HP Labs Innovation Research Grant.



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Department of Mathematical and Computer Sciences, University of Wisconsin-Whitewater, Whitewater, USA
  2. Computer Science, University of Wisconsin-Milwaukee, Milwaukee, USA
