Bilderatlas: what could Warburg have done with today’s tools?

Definition

Bilderatlas is a collection of tables, made by Aby Warburg, based on fourteen different themes. Warburg was a German art historian who, in the last two years of his life, started working on this atlas. His work, however, remained incomplete [1].

For each table, he pinned to a wooden panel several pictures and paintings that share a common theme. Despite his vast knowledge of the subject, the number of artworks he considered is very small compared to the image databases available today. So, if his wide knowledge were complemented by computing power, what could Warburg have done with modern tools?

The aim of our project is to find an answer to this question. We will try to create a sort of continuation of his work using today’s technologies. In the end, we would like to find a systematic method for searching for pattern similarities in a large database of images.

Methodology

Normally, the pictures in a table share visual similarities; in some tables, however, there is no apparent visual connection. As a consequence, computational methods based on visual features will find it harder to relate the pictures in such a table to their theme. In any case, we are not interested in retrieving exactly the images that Warburg connected in a given table; our aim is rather to continue his work on a large database of images.

Since we are dealing with pictures connected by patterns that are not strictly visual, we need a search method based on a high-level representation of the images. This means that the pictures must not be read just as a collection of pixels, but rather in a more informative way. Therefore, we will base our analysis on deep learning techniques. In particular, Convolutional Neural Networks (CNNs) seem to be the most appropriate choice for approaching this task [2].

A neural network is a model loosely inspired by the human brain and its way of processing information. In the image-processing context, the network takes an image as input and extracts a feature array that characterizes it. This array is a high-level representation of the image, which should allow us to recognize patterns that are not strictly visual.
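To make the idea of comparing feature arrays concrete, here is a minimal sketch in Python. It assumes that each image has already been reduced to a list of numbers, and it uses cosine similarity, which is only one common way of comparing such arrays and not necessarily the measure used by the CNNs in this project. The function name and the toy 4-dimensional arrays are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two feature arrays."""
    if len(a) != len(b):
        raise ValueError("feature arrays must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero array has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy feature arrays (invented values, real arrays are much longer).
nymph_a = [0.9, 0.1, 0.4, 0.0]
nymph_b = [0.8, 0.2, 0.5, 0.1]
globe = [0.0, 0.9, 0.1, 0.7]

print(cosine_similarity(nymph_a, nymph_b))  # close to 1: similar arrays
print(cosine_similarity(nymph_a, globe))    # much lower: dissimilar arrays
```

Two images whose arrays point in nearly the same direction get a value close to 1, even if their pixels differ considerably.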

Setup

The first step will be to choose the appropriate tables for our analysis. To explore the range of results that can be obtained through CNNs, we will select four or five tables that have different characteristics. We will pick some tables in which the images have strong visual similarities, such as the one in Figure 1, where the geometry of the circle is an evident common pattern in all the images. In addition, we will choose two tables where a pattern is not markedly evident. An example is reported in Figure 2: the human eye can see the recurring theme of the nymph, but this perception is the result of the interpretation performed by our brain. For a computer, this kind of task is not as easy. By choosing these two different kinds of tables and comparing the results, we will be able to discover whether CNNs are able to recognize patterns that are more conceptual than strictly visual.

Figure 1 – Table 22 [3].
Figure 2 – Table 46 [3].

Afterwards, the images of the chosen tables need to be loaded into a database that contains around 40,000 pictures. This will be the source on which the CNNs will carry out their analyses. We will load the pictures on the DH Canvas server and annotate them. The annotations will be taken directly from the captions in the Warburg tables, and will include the author, title, and period of the artwork, as well as the title of the table in which each image appears. DH Canvas will in turn provide a URL for each image, along with its ID and all related information.
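As an illustration of what one annotation could look like, the sketch below models it as a small Python record and serializes it to JSON. The class name, the field names and the placeholder values are assumptions made for this example; the actual format expected by DH Canvas may differ.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    """One annotated image (hypothetical field names)."""
    author: str
    title: str
    period: str
    table_title: str


# Placeholder values, not taken from a real caption.
record = Annotation(
    author="(author from caption)",
    title="(title from caption)",
    period="(period from caption)",
    table_title="Table 46",
)

# JSON is a convenient form for sending the record to a web service.
print(json.dumps(asdict(record), ensure_ascii=False, indent=2))
```

Keeping the four caption fields in a fixed structure makes it easier to check that every uploaded image carries the same information.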

To add our images to the CNNs database, we will write a piece of Python code that addresses the server through its API, providing the image URLs obtained from DH Canvas. This web service will then add them to the 40,000-image database.
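The upload step could look roughly like the following sketch, which only builds the HTTP request and does not send it. The endpoint address, the JSON layout with a "urls" key and the function name are all assumptions, since the server's actual API is not described here.

```python
import json
import urllib.request

# Hypothetical endpoint: the real address of the CNNs server is not given here.
API_URL = "https://example.org/api/images"


def build_add_request(image_urls, api_url=API_URL):
    """Build a POST request that submits a list of image URLs as JSON."""
    payload = json.dumps({"urls": list(image_urls)}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder image URL standing in for one provided by DH Canvas.
request = build_add_request(["https://example.org/iiif/image-1.jpg"])
print(request.get_method(), request.full_url)

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Separating the construction of the request from the act of sending it makes the payload easy to inspect before anything reaches the server.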

Another piece of Python code will be needed to interact with the CNNs. It will allow us to launch a query and retrieve the results returned by the CNNs, displaying the images together with their related information.

Analysis

At this point, the setup will be ready and we will start our research. We will treat the CNNs as a black box, because it is very difficult to fully understand the process and to predict what the result will be.

When querying the database, we can provide one or more images; the CNNs will search for a common pattern among the queried pictures and will return the images in the database that are closest to them in terms of feature arrays. Every returned image will be assigned a score that measures its distance from the queried images; the score tells us how similar it is to the initial set of pictures. The resulting images will be displayed alongside their related information, including the score and their ID (Figure 3), allowing us to determine whether an image belongs to Warburg’s original table.
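Since the server is treated as a black box, the following is only a toy model of such a query. It assumes that a multi-image query is summarized by the average of its feature arrays and that the score is the Euclidean distance to that average; both choices, like the image IDs and values, are assumptions made for illustration.

```python
import math


def centroid(vectors):
    """Average several feature arrays component by component."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def rank_by_distance(query_vectors, database, top_k=3):
    """Return the top_k (image_id, score) pairs, smallest distance first."""
    center = centroid(query_vectors)
    scored = [(math.dist(center, vec), image_id)
              for image_id, vec in database.items()]
    scored.sort()
    return [(image_id, score) for score, image_id in scored[:top_k]]


# Toy database of 2-dimensional feature arrays (invented values).
database = {
    "img_001": [0.9, 0.1],
    "img_002": [0.8, 0.2],
    "img_003": [0.1, 0.9],
}
query = [[1.0, 0.0], [0.8, 0.2]]

for image_id, score in rank_by_distance(query, database):
    print(image_id, round(score, 3))
```

In this model a lower score means a closer match, so the results are listed from most to least similar.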

Figure 3 – Nymph query

An interesting feature of the CNNs server is the possibility of marking some images as “negative”. By doing so, we state that a particular image must not be included in the result, meaning that the pattern we are searching for is not present in that image. This feature can be exploited when a result contains an unwanted picture: the same query can be launched again with this image set as “negative”.
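How the server uses “negative” images internally is not known to us. One plausible way to model the effect, sketched below, is to reward closeness to the positive images and penalize closeness to the negative ones. The scoring formula, the weight parameter and the toy data are assumptions, not a description of the server.

```python
import math


def centroid(vectors):
    """Average several feature arrays component by component."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def rank_with_negatives(positives, negatives, database, weight=0.5):
    """Rank image IDs; lower score is better.

    score = distance to positive centroid
            - weight * distance to negative centroid (if any negatives)
    """
    pos_center = centroid(positives)
    neg_center = centroid(negatives) if negatives else None
    results = []
    for image_id, vec in database.items():
        score = math.dist(vec, pos_center)
        if neg_center is not None:
            # Images far from the negatives get a bonus (a lower score).
            score -= weight * math.dist(vec, neg_center)
        results.append((score, image_id))
    results.sort()
    return [image_id for _, image_id in results]


# Toy database (invented values).
database = {"a": [1.0, 0.0], "b": [0.6, 0.6], "c": [0.5, -0.6]}
positives = [[1.0, 0.0]]

print(rank_with_negatives(positives, [], database))
print(rank_with_negatives(positives, [[0.6, 0.6]], database))
```

In the second call, the image marked as negative ("b") drops to the bottom of the ranking, which mirrors the behaviour we hope to obtain when relaunching a query.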

For each table, we will launch several queries in a systematic way, using “negative” images to refine the results. First, we will select the image that seems the most representative of its table and see whether the results include other elements belonging to the same table. The next step could be to select a group of representative pictures, rather than a single image, and observe how the results vary as certain elements are added to or removed from the query.

Another approach we will follow consists of querying with all the images in a table, and examining how the results change when some images are removed from the initial set.
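The systematic removal of images can be organized as in the sketch below, which generates every query obtained by leaving one image out and measures what fraction of a table appears among the returned results. The function names, the image IDs and the choice of this fraction as a measure are assumptions for illustration; the returned list here is invented rather than produced by the server.

```python
def leave_one_out(table_ids):
    """Yield (held_out_id, query_ids) for each image in the table."""
    for held_out in table_ids:
        query = [image_id for image_id in table_ids if image_id != held_out]
        yield held_out, query


def table_recall(returned_ids, table_ids):
    """Fraction of the table's images that appear among the results."""
    table = set(table_ids)
    if not table:
        return 0.0
    return len(table & set(returned_ids)) / len(table)


# Hypothetical image IDs for one table.
table = ["t46_01", "t46_02", "t46_03"]

for held_out, query in leave_one_out(table):
    print("query without", held_out, "->", query)

# Invented result list standing in for what the server might return.
returned = ["t46_01", "other_17", "t46_03"]
print(table_recall(returned, table))
```

Comparing this fraction across the different query sets gives a simple, repeatable way to see which images matter most for a table.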

Results

The last part of our project will be dedicated to the analysis of the obtained results.

Particular attention will be given to comparing the outcomes for the different tables. We expect this method to work quite well for visual patterns, but it will be interesting to discover how it performs in recognizing images linked by conceptual patterns. We will finally be able to estimate the effectiveness of CNNs in continuing Warburg’s work.

This topic fits well within the Digital Humanities field, since we will work on a large amount of data and exploit the computational power of computers to address humanistic challenges.

Stay tuned!

Milestones

Weeks 1 and 2: Select tables and annotate them. Complete the first bot that loads images into the CNNs database.

Weeks 3, 4 and 5: Complete the second bot that launches the queries and manages the results.

Weeks 6 and 7: Analyze the first two tables with all the criteria discussed before.

Weeks 8 and 9: Analyze the last two tables with all the criteria discussed before.

Weeks 10 and 11: Interpret the results of CNNs on Warburg’s work to assess their reliability.

Weeks 12 and 13: Complete final report and poster.

References

[1] https://en.wikipedia.org/wiki/Aby_Warburg

[2] Isabella di Lenardo, Benoit Seguin, Frédéric Kaplan. Visual Patterns Discovery in Large Databases of Paintings.

[3] http://www.engramma.it/eOS2/atlante/