Abstract
Dealing with documents that have changed through time requires keeping track of additional metadata, for example the order of the revisions. This small issue explodes in complexity when these documents are translated. Even more complicated is keeping track of the parallel evolution of a document and its translations. The fact that this extra metadata has to be encoded in formal terms in order to be processed by computers has forced us to reflect on issues that are usually overlooked or, at least, not actively discussed and documented: How do I record which document is a translation of which? How do I record that this document is a translation of that specific revision of another document? And what if a certain translation has been created using one or more intermediate translations with no access to the original document? In this paper we address all these issues, starting from first principles and incrementally building towards a comprehensive solution. This solution is then distilled into formal concepts (e.g., translation, abstraction levels, comparability, division in parts, addressability) and abstract data structures (e.g., derivation graphs, revisions-alignment tables, source-document tables, source-part tables). The proposed data structures can be seen as a generalization of the classical evolutionary trees (e.g., stemma codicum), extended to take into account the concepts of translation and contamination (i.e., multiple sources). The presented abstract data structures can easily be implemented in any programming language and customized to fit the specific needs of a research project.
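To make the idea of a derivation graph more concrete, the following is a minimal illustrative sketch in Python. It is not the formal model defined in the paper: the names `Revision`, `DerivationGraph`, `add`, `is_contaminated` and `roots` are hypothetical, the document identifiers are invented for the example, and treating "more than one direct source" as contamination is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    """One revision of one document (all field names are assumptions)."""
    doc: str   # hypothetical document identifier
    lang: str  # language code of this document
    rev: int   # position of the revision within its own document


@dataclass
class DerivationGraph:
    """Maps each revision to the revisions it was directly derived from."""
    sources: dict = field(default_factory=dict)

    def add(self, target, *direct_sources):
        # Record the target and make sure every source is also a known node.
        self.sources.setdefault(target, set()).update(direct_sources)
        for source in direct_sources:
            self.sources.setdefault(source, set())

    def is_contaminated(self, target):
        # Simplifying assumption: contamination = several direct sources.
        return len(self.sources.get(target, ())) > 1

    def roots(self, target):
        """Return the ancestors of target that have no recorded source."""
        seen, stack, found = set(), [target], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            parents = self.sources.get(node, set())
            if not parents:
                found.add(node)
            stack.extend(parents)
        return found


# Invented example: an Arabic text with two revisions, a Hebrew translation
# of the first revision, and a Latin translation made from the Hebrew only.
ar1 = Revision("treatise", "ar", 1)
ar2 = Revision("treatise", "ar", 2)
he1 = Revision("treatise-he", "he", 1)
la1 = Revision("treatise-la", "la", 1)
la2 = Revision("treatise-la", "la", 2)

graph = DerivationGraph()
graph.add(ar2, ar1)       # later revision of the same document
graph.add(he1, ar1)       # translation of a specific revision
graph.add(la1, he1)       # translation through an intermediate translation
graph.add(la2, la1, ar2)  # revised using two sources
```

In this sketch the same edge type records both "is a revision of" and "is a translation of"; a real project would likely want to label edges, and to add the alignment and source tables mentioned above, which are not modeled here.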
Notes
For practical examples, see the chunking system used by the Averroes project.
Cite this article
Barabucci, G. Tracking the evolution of translated documents: revisions, languages and contaminations. Int J Digit Humanities 1, 235–250 (2019). https://doi.org/10.1007/s42803-019-00013-9
DOI: https://doi.org/10.1007/s42803-019-00013-9