This contribution is based on the possibility of successfully applying innovative scientific technologies to philological, art historical and bibliographical studies.

The books that were printed all over Europe from the early 1450s to 31 December 1500 are known as incunabula. Some 30,000 editions survive today, in about 450,000 copies scattered across more than 4000 public libraries and private collections all over the world.

A full inventory of the surviving books is available in the Incunabula Short Title Catalogue (ISTC) database, coordinated by the British Library (London).Footnote 1 Their typographical features are outlined in the Gesamtkatalog der Wiegendrucke (GW).Footnote 2

The history of each surviving copy (former owners, decoration, binding, manuscript annotations etc., information collectively referred to as provenance) has been traditionally described in library catalogues and is now being brought together in the Material Evidence in Incunabula (MEI) database.Footnote 3

The texts contained in each edition are being described in detail in the Text-Inc database.Footnote 4

High-quality data from MEI are being used in a visualization suite to present the circulation of books over time and space.

These last three digital tools are the product of the 15cBOOKTRADE, a 5-year European Research Council-funded project led by Dr Cristina Dondi and based at the University of Oxford (Faculty of Medieval and Modern Languages/Lincoln College).

Along with types, textual content, and provenance evidence, the other fundamental component of incunabula is the printed illustration, which in the early stage of printing was produced mainly by inserting carved woodblocks into the printing forme.Footnote 5 Hence the name woodcuts.

At a time when the spread of printing made books available to wider sections of society, images in books continued to serve as visual aids in deciphering and clarifying the content of the verbal signs, as starting points for meditation, memory and thinking, and as a primary intellectual tool in the reading process. From time to time they were used simply to catch the reader’s attention, which resulted in a random choice of images, disconnected from the text.

With the introduction of printing, the iconographic apparatus of illustrated books gradually shifted away from the illuminated manuscript, where unique decorations were carried out to fulfil the requirements of a single patron, towards multiple copies, mechanically printed and widely distributed, containing illustrations that had to be appreciated and understood by a more general public.

Woodblocks soon became a fundamental part of printers’ business capital. On a par with types, paper, and the press itself, they had an economic value: they could be loaned to other printers, they were exchangeable, and they were marketable. And in fact, from the very beginning, many of the woodblocks which had been commissioned and prepared in order to illustrate early Fifteenth-century printed editions started to be copied or re-used in other editions, within the same iconographic cycle, or as single images, illustrating the same or a different text, by the same or by different printers, sometimes in different countries.

Over the centuries, scholars have made many individual efforts to explore and better understand how the role of book decoration, and the creative process it entailed, changed with the introduction of printing, and to clarify the relationship between painting, illumination and the different stages of printed production.Footnote 6

In recent years, many projects have also been fruitfully testing the application of digital technologies to different kinds of images, including early printed images.Footnote 7 Nonetheless, a coordinated, systematic approach able to track the production, circulation, use and reutilization of Fifteenth-century printed woodcuts is still lacking.

In this context, the 15cBOOKTRADE has been working towards the creation of a tool for cataloguing and researching the production, use and circulation of Fifteenth-century printed woodcuts, in collaboration with the Department of Engineering Science at the University of Oxford (Visual Geometry Group, coordinated by Professor Andrew Zisserman).

The final objective is a system for searching datasets of Fifteenth-century printed images based on the integrated application of both instance-level and category-level image search. Instance-level search enables all instances (prints) of a particular woodcut to be matched, that is, retrieved from the dataset. Category-level search enables all woodcuts illustrating a category (such as “containing a dog or an XX”) to be retrieved.

The first step of the instance-level process is the application of automatic object retrieval technologies, which seemed particularly suitable for tracking and locating the recurrences of the same woodblock through a potentially endless number of editions by the same or by a different printer, of the same or of a different text. After being uploaded to an online repository, each image, or a selected region of interest within the image itself (in this case, a particular woodblock or part of it), can be used as a query. The object retrieval software will automatically return all the images that contain the query region within seconds (Figs. 1, 2, 3, 4).Footnote 8

Fig. 1

A screenshot from the 15cBOOKTRADE visual recognition searching demo (© VGG & 15cBOOKTRADE Project). The image is a digital reproduction of leaf g7r of the Aesopus moralisatus, Venice: Manfredus de Bonellis, de Monteferrato, 31 Jan. 1491, copy owned by the Biblioteca Corsiniana (Rome, Italy), shelfmark 51.E.54 (ISTC ia00151000; MEI 02011231). Part of the picture has been selected in order to be compared with the rest of the dataset

Fig. 2

The result of the query in Fig. 1. The image-matching software detects the recurrence of the query image in one more edition: Aesopus moralisatus, Venice: Manfredus de Bonellis, de Monteferrato, 15 Feb. 1491, copy owned by the Fondazione Giorgio Cini (Venice, Italy), shelfmark FOAN TES 10, leaf g7r (ISTC ia00152000; MEI 00202205). As the reader will notice, in this case the same woodblock was re-used by the same printer within two different frames

Fig. 3

A detailed comparison between the visual semantic regions of the two images using the “bag of visual words” method

Fig. 4

The results of a different query in the 15cBOOKTRADE image-retrieval demo. The query image appears at the top right of the screenshot. As the reader will notice, neither the presence or absence of colour nor the differing quality of the paper affects the effectiveness of the matching system

Technically speaking, the retrieval system first detects hundreds of points of interest in the query image and extracts the corresponding visual features; it then encodes the image as a single vector, treating all its visual components as a “bag” (multiset) of visual words (the “bag of words” method). The vector is compared with the whole dataset to produce a ranked list in which the images with the strongest correspondence to the query vector appear first. Finally, this initial list is re-ranked according to the geometric consistency between each top-ranked dataset image and the query.

This instance-level object retrieval pipeline is particularly useful because it allows scholars to find out quickly and exactly which images appear and where, without having to go through dozens of physical volumes or digital reproductions, which are often not easily accessible. Thanks to its flexibility and reliability, this retrieval system has also been applied by the Visual Geometry Group to many other datasets, such as the British Library’s “1 million images” dataset and the Bodleian Library Ballads dataset; these applications suggested that it might be profitably applied to Fifteenth-century printed book illustrations.Footnote 9
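The encoding and ranking steps described above can be illustrated with a minimal sketch. It is not the Visual Geometry Group's implementation: the two-dimensional descriptors, the three-word vocabulary and the image names are toy assumptions, plain cosine similarity stands in for whatever scoring the actual system uses, and the final geometric re-ranking step is left out.

```python
import math
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Index of the visual word (centroid) closest to a local descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bag_of_words(descriptors, vocabulary):
    """Encode an image as a multiset of visual-word indices."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query_descriptors, dataset, vocabulary):
    """Return (score, image name) pairs, strongest match first."""
    query_vector = bag_of_words(query_descriptors, vocabulary)
    scored = [(cosine(query_vector, bag_of_words(descs, vocabulary)), name)
              for name, descs in dataset.items()]
    return sorted(scored, reverse=True)

# Toy data (hypothetical): real systems use high-dimensional descriptors
# and vocabularies with a very large number of visual words.
VOCABULARY = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
DATASET = {
    "image_a": [(0.1, 0.2), (9.8, 0.1), (10.2, -0.1)],
    "image_b": [(0.3, 9.9), (-0.2, 10.1)],
}
QUERY = [(0.0, 0.1), (10.1, 0.0), (9.9, 0.2)]

ranking = rank(QUERY, DATASET, VOCABULARY)
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy case `image_a` shares all its visual words with the query and is ranked first, while `image_b` shares none. A geometric verification pass over the top results would then confirm that the matched points are arranged consistently in both images.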

In comparison with other categories of images, such as those found in single sheet ballads or prints, scholars aiming to catalogue and classify Fifteenth-century printed book illustrations have to deal with an additional level of complexity, due to the technical constraints of the printing process of a book, which can run to hundreds of pages and contain several illustrations, occasionally repeated. In order to find out how many and which illustrations were used in a book, it is necessary to map in detail their presence inside the book itself.

Every image in this database is named with a unique identifier which brings together three elements:

  • ISTC number,

  • MEI number of the copy portrayed in the picture,

  • foliation.

From this sequence, it becomes immediately clear in which edition, in which copy and where exactly in the copy the searched image can be found.Footnote 10
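As an illustration of how such a three-part identifier can be built and read back, here is a small sketch. The underscore separator and the function names are assumptions made for this example, since the exact naming syntax is not given here; the sample values are those of the copy shown in Fig. 1.

```python
from typing import NamedTuple

SEPARATOR = "_"  # assumed for illustration; the project's actual syntax may differ

class ImageId(NamedTuple):
    istc: str  # edition (ISTC number)
    mei: str   # copy (MEI number)
    leaf: str  # position in the copy (foliation)

def build_id(istc: str, mei: str, leaf: str) -> str:
    """Join the three elements into a single image name."""
    return SEPARATOR.join((istc, mei, leaf))

def parse_id(identifier: str) -> ImageId:
    """Split an image name back into edition, copy and leaf."""
    parts = identifier.split(SEPARATOR)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected ISTC, MEI and foliation in {identifier!r}")
    return ImageId(*parts)

# Values taken from the caption of Fig. 1.
name = build_id("ia00151000", "02011231", "g7r")
parsed = parse_id(name)
print(name, parsed.istc, parsed.mei, parsed.leaf)
```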

In particular, the image-matching tool is able to detect the recurrences of certain woodcuts in different editions, which enables us to explore how printed material may have been exchanged between or copied by different printers; it also detects the recurrence of the same woodcut within a single edition. In this case, by precisely locating every occurrence of a given woodcut throughout the edition, it becomes evident when the woodcut appears more than once within a single printing sheet, although in different combinations.

This suggests that the printer had more than one woodblock of that kind at his disposal. After completing this operation for all the woodcuts, providing the exact number and location of their recurrences throughout the book, we are able to calculate exactly how many blocks were used in one edition and in which combinations.
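The counting argument above can be sketched in a few lines. The sketch assumes that each matched impression has already been assigned to a printing sheet by the cataloguer; the woodcut names and sheet labels are hypothetical. Following the reasoning in the text, the largest number of impressions of one woodcut on any single sheet is taken as a lower bound on the number of blocks the printer must have had.

```python
from collections import Counter

def minimum_blocks(occurrences):
    """Lower bound on the number of blocks per woodcut.

    `occurrences` is an iterable of (woodcut, sheet) pairs, one pair per
    printed impression found by the image-matching tool.
    """
    per_sheet = Counter(occurrences)  # impressions per (woodcut, sheet) pair
    result = {}
    for (woodcut, _sheet), count in per_sheet.items():
        result[woodcut] = max(result.get(woodcut, 0), count)
    return result

# Hypothetical data: "border_A" is printed twice on sheet "a1".
occurrences = [
    ("border_A", "a1"),
    ("border_A", "a1"),
    ("border_A", "b2"),
    ("fox", "a1"),
    ("fox", "c1"),
]

blocks = minimum_blocks(occurrences)
print(blocks)
```

Here `border_A` would require at least two blocks, while a single block suffices for `fox`. The result is only a lower bound: a printer may have owned more blocks than any one sheet reveals.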

Reconstructing the composition of the printing sheet becomes useful when trying to analyse the working practice of a printer. While we know a lot about the type case of individual printers, we know next to nothing about their possession of woodblocks. Therefore, this systematic approach is shedding new light on Fifteenth-century printing.

Moreover, when considered not only for their artistic quality but primarily for their content and iconographic features, printed images in early editions have a special value in the reconstruction of the transmission of the text in print. By looking at the iconographic apparatus and its relationship with the text, and by investigating the sources of both the iconographic and textual tradition, scholars can assess otherwise unknown business relationships between printers, authors and illustrators. They can also explore the development of a certain iconographic cycle or artistic style, school or person. Ultimately, scholars can uncover links with the earlier, as well as with the later, manuscript and in-print transmission.