In this second blog post we present the results achieved so far, along with the changes made to some aspects of our project.
A new database of paintings has been provided, since the previous one contained too few images for machine learning techniques to be applicable. The new collection gathers about 36,000 items, which we analysed manually to reject all unsuitable images. The archive contains a large number of images we are not interested in, such as photos of sculptures and buildings and depictions of objects that are not suitable for our purposes. Paintings depicting overly crowded scenes were also removed, because they would have greatly complicated the extraction of skeletons.
While browsing through the database we noticed that particular religious motifs recur. The most frequent are Christ on the Cross, Madonna with Child and the dead Christ [Fig 1,2,3]. For this reason we decided to focus on the recognition of these motifs, which in theory our algorithm should identify as different clusters. Since it makes no sense to define in advance the clusters that the algorithm is supposed to discover, several “noise” images have also been included in the data set.
Processing the data
With the database completed, we concentrated on finalizing the Matlab interface. Since the interface was shown in the previous blog post, we now describe how the data is saved. For each uploaded painting, a text file with the corresponding name is created. This file contains matrices with body parts as rows and coordinates as columns [Fig 4]. For skeletons that are not completely defined (some vertices are missing because the corresponding body parts are hidden), a NaN value is inserted in the matrix. We chose NaN values to keep the matrix well defined and to make missing points immediately visible when looking at the text file. These points will be removed before the clustering algorithm is run.
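The handling of missing points described above can be sketched in a few lines. This is only an illustration in Python rather than our Matlab code, and the file layout (one row per body part, whitespace-separated x and y columns) is an assumption made for the example; the files written by the interface may be organized differently. The function names are hypothetical.

```python
import math

# Hypothetical file content: one row per body part, columns are x and y.
# The real layout written by the Matlab interface may differ.
SAMPLE = """\
120.0 45.0
118.0 80.0
NaN NaN
150.0 95.0
"""

def parse_skeleton(text):
    """Parse whitespace-separated rows into a list of (x, y) tuples."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            x, y = (float(value) for value in line.split())
            rows.append((x, y))
    return rows

def drop_missing(points):
    """Keep (row_index, point) pairs whose coordinates are all defined."""
    return [(index, point) for index, point in enumerate(points)
            if not any(math.isnan(coordinate) for coordinate in point)]

skeleton = parse_skeleton(SAMPLE)   # the hidden body part is kept as NaN
visible = drop_missing(skeleton)    # only the defined points remain
```

Keeping the original row index next to each surviving point preserves the link between a coordinate and its body part after the NaN rows are removed.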
Once all the coordinates have been gathered, they must be processed to obtain comparable skeletons (for instance, to remove differences in scale between paintings). This is achieved by normalizing each skeleton with respect to the distance between the head and the abdomen, a parameter chosen because it is a good indicator of body size. This step also requires a relative reference frame centred on the head.
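As a rough sketch of this normalization (in Python, not our Matlab code), each point can be translated so that the head becomes the origin and then divided by the head-abdomen distance. The row indices used for the head and the abdomen, as well as the function name, are assumptions made for the example.

```python
import math

def normalize_skeleton(points, head=0, abdomen=1):
    """Centre the skeleton on the head and scale by the head-abdomen distance.

    points: list of (x, y) tuples; head and abdomen are row indices
    (assumed here to be rows 0 and 1).
    """
    head_x, head_y = points[head]
    abdomen_x, abdomen_y = points[abdomen]
    scale = math.hypot(abdomen_x - head_x, abdomen_y - head_y)
    if scale == 0 or math.isnan(scale):
        raise ValueError("head and abdomen must be defined and distinct")
    # Missing points (NaN) simply stay NaN after this transformation.
    return [((x - head_x) / scale, (y - head_y) / scale) for x, y in points]

# The same pose drawn at two different sizes and positions.
small = [(10.0, 10.0), (10.0, 20.0), (15.0, 15.0)]
large = [(100.0, 50.0), (100.0, 70.0), (110.0, 60.0)]

normalized_small = normalize_skeleton(small)
normalized_large = normalize_skeleton(large)
```

After normalization the two skeletons have identical coordinates, which is the property needed to compare poses regardless of how large a figure appears in the painting.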
Machine learning techniques
With the above in place, the most important step is to define an algorithm that processes the input data and produces accurate results. Its outputs will be clusters of skeletons based on recurrent motifs and poses.
Several machine learning techniques are currently used for this kind of clustering analysis; one of the most common is K-means.
K-means aims to partition N observations into K clusters (with K < N), so that each observation belongs to exactly one cluster: the one whose centroid (mean) is nearest.
The accuracy of this method is strictly related to how well the input data (skeletons in our case) are represented. Our idea is to use the coordinates of the body parts as input to the algorithm. As mentioned above, the coordinates will be expressed in a local reference frame centred, for example, on the head.
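To make the idea concrete, here is a minimal sketch of K-means (Lloyd's algorithm) applied to flattened coordinate vectors. It is written in plain Python for illustration and is not the implementation used in the project; the toy poses, the `flatten` helper and the choice of K = 2 are all assumptions. It also assumes complete skeletons, i.e. no NaN values in the input.

```python
import random

def flatten(points):
    """Turn a list of (x, y) points into one feature vector [x0, y0, x1, y1, ...]."""
    return [coordinate for point in points for coordinate in point]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(observations, k, iterations=100, seed=0):
    """Lloyd's algorithm: returns (labels, centroids)."""
    if not 0 < k <= len(observations):
        raise ValueError("k must be between 1 and the number of observations")
    rng = random.Random(seed)
    # Initialise the centroids with k distinct observations chosen at random.
    centroids = [list(o) for o in rng.sample(observations, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(o, centroids[c]))
            for o in observations
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [o for o, label in zip(observations, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(column) / len(members) for column in zip(*members)]
    return labels, centroids

# Toy normalized skeletons (head, abdomen, one hand); the values are invented.
skeletons = [
    [(0.0, 0.0), (0.0, 1.0), (0.5, 0.5)],
    [(0.0, 0.0), (0.0, 1.0), (0.6, 0.5)],
    [(0.0, 0.0), (0.0, 1.0), (-2.0, 0.1)],
    [(0.0, 0.0), (0.0, 1.0), (-2.1, 0.0)],
]
observations = [flatten(skeleton) for skeleton in skeletons]
labels, centroids = kmeans(observations, k=2)
```

On this toy data the first two skeletons end up in one cluster and the last two in the other. The cluster numbers themselves are arbitrary, since they depend on the random initialization.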
So far we have tested this idea on a few skeletons and are still assessing its reliability. Once we have collected more results, we will be able to decide whether to keep this representation.