This project belongs to the broad research field of automatic pattern recognition. We apply these techniques to paintings in order to extract features that recur throughout the history of art, such as religious motifs. In particular, we focus on identifying specific poses of the people depicted. Each person is associated with a skeleton, which is traced manually through a previously designed interface. From all the pictures we gathered around one thousand skeletons, whose coordinates are stored in a high-dimensional matrix.
Approaches and issues
The unsupervised machine learning algorithm we chose is K-means clustering, since it fits our purpose of grouping people according to their poses. The method partitions the input observations into clusters so as to minimize the within-cluster sum of squares, i.e. the total squared distance between each observation and the centroid of its cluster.
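To make the partitioning idea concrete, here is a minimal sketch of standard K-means (Lloyd's algorithm) in plain Python. It is not the code used in the project: the function names, the random initialization and the toy 2D data are assumptions chosen for illustration, whereas the real input would be the twelve-dimensional skeleton vectors.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(data, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, wcss)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k observations drawn at random.
    centroids = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_vector(members)
    # Within-cluster sum of squares: the quantity K-means tries to minimize.
    wcss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(data, labels))
    return centroids, labels, wcss


# Toy data: two well-separated groups of 2D points.
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, labels, wcss = kmeans(data, k=2)
print(labels, round(wcss, 3))
```

The result depends on the initial centroids in general, so in practice the algorithm is usually run several times and the partition with the lowest within-cluster sum of squares is kept.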
At a later stage we also looked for alternatives to K-means that might improve the clustering accuracy. After an intensive literature review we found another K-means-based method [1], which is meant to improve on the standard one. However, our implementation of the new algorithm showed no significant differences in the results while increasing the computational cost. For this reason it was set aside, and all the results presented below were obtained with standard K-means.
Since K-means takes as input a data matrix whose rows are observations (skeletons) described by their features (Euclidean coordinates), its accuracy depends strongly on the completeness of the input. Every missing point in a skeleton is therefore a loss of information that degrades the accuracy of the algorithm. This is the main issue to overcome, since overlapping body parts are very common in paintings.
As discussed in the last blog post, two solutions to this problem were proposed and tested. The first replaces the missing points with the head coordinates, while the second replaces them with the average of the coordinates of the same body part over the other skeletons in which it is available. The evaluation of the two methods did not reveal any remarkable difference, so both are used interchangeably.
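The two strategies for missing points could look roughly like the sketch below. Several details are assumptions made for illustration and not taken from the project: a skeleton is represented as a list of `(x, y)` tuples with `None` for a hidden point, the head is assumed to sit at index 0 of the same skeleton, and the toy skeletons have only three points.

```python
HEAD = 0  # assumed index of the head point; the real layout may differ


def fill_with_head(skeleton):
    """Replace every missing point with the skeleton's own head coordinates."""
    head = skeleton[HEAD]
    if head is None:
        raise ValueError("head point is missing, so it cannot be used as a filler")
    return [head if point is None else point for point in skeleton]


def fill_with_part_mean(skeletons):
    """Replace missing points with the mean of the same body part elsewhere."""
    n_parts = len(skeletons[0])
    means = []
    for part in range(n_parts):
        known = [s[part] for s in skeletons if s[part] is not None]
        if not known:
            raise ValueError(f"body part {part} is missing in every skeleton")
        means.append((
            sum(x for x, _ in known) / len(known),
            sum(y for _, y in known) / len(known),
        ))
    return [
        [means[i] if point is None else point for i, point in enumerate(s)]
        for s in skeletons
    ]


def flatten(skeleton):
    """Turn [(x, y), ...] into the flat feature vector [x0, y0, x1, y1, ...]."""
    return [value for point in skeleton for value in point]


# Toy skeletons with three points each; None marks a hidden body part.
skeletons = [
    [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)],
    [(0.0, 1.0), None, (4.0, 2.0)],
    [(1.0, 0.0), (3.0, 4.0), None],
]
head_filled = fill_with_head(skeletons[1])
mean_filled = fill_with_part_mean(skeletons)
print(flatten(head_filled))
print(flatten(mean_filled[1]))
```

Either way, every skeleton ends up with the same number of coordinates, which is what K-means needs in order to compare them.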
Representing the results
Each input observation (skeleton) processed by our algorithms is a twelve-dimensional vector. Since a twelve-dimensional space cannot be drawn directly in 2D, we needed a method to visualize the results of the clustering procedure. We chose Principal Component Analysis (PCA) for this purpose.
PCA is a statistical procedure developed to simplify data investigation. It converts a set of observations of possibly correlated variables into a set of linearly uncorrelated variables called principal components. Since the number of principal components is lower than or equal to the number of original variables, keeping only the first few of them gives an easy visualization of all the observations, with far fewer dimensions than were recorded during the data collection phase. PCA can also help reveal clusters in a set of data, which makes it a suitable tool for visualizing the results of our clustering algorithm. We set the number of components to two so that the results can be shown as a scatter plot.
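As an illustration of the reduction to two components, the sketch below computes the first two principal components with power iteration and deflation on the covariance matrix. This is only one possible way to obtain them and is not necessarily what the project used; the function names and the toy 3D data are assumptions.

```python
import math
import random


def center(data):
    """Subtract the per-feature mean from every observation."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[v - m for v, m in zip(row, means)] for row in data], means


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Dominant eigenvector and eigenvalue of a symmetric PSD matrix."""
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in range(len(matrix))]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        if norm < 1e-12:
            return vec, 0.0  # no variance left in any direction
        vec = [p / norm for p in product]
        eigenvalue = norm
    return vec, eigenvalue


def pca(data, n_components=2):
    """Return the leading principal directions and the feature means."""
    centered, means = center(data)
    matrix = covariance(centered)
    components = []
    for _ in range(n_components):
        vec, value = top_eigenvector(matrix)
        components.append(vec)
        # Deflation: remove the variance already explained by this direction.
        matrix = [
            [matrix[i][j] - value * vec[i] * vec[j] for j in range(len(vec))]
            for i in range(len(vec))
        ]
    return components, means


def project(points, components, means):
    """Coordinates of each point along the principal directions."""
    return [
        [sum((v - m) * c for v, m, c in zip(p, means, comp)) for comp in components]
        for p in points
    ]


# Toy 3D data whose variance is largest along x, then y, then z.
data = [[3.0, 0.0, 0.0], [-3.0, 0.0, 0.0], [0.0, 2.0, 0.0],
        [0.0, -2.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
components, means = pca(data, n_components=2)
projected = project(data, components, means)
```

The same `project` call, with the components and means fitted on the skeletons, can be applied to the K-means centroids so that both end up in the same 2D plane.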
By applying PCA to the skeletons and to the cluster centroids estimated by K-means, we obtained several scatter plots. The following picture shows only the skeletons that are closest to their respective centroids [Fig 1].
Each black star and its number represent the centroid of a cluster, while the filled points are the skeletons. Since K-means requires the number of clusters as an input, K = 9 and K = 8 were identified as suitable in terms of clustering accuracy.
The skeleton database is well represented by PCA, and each skeleton is attributed to its respective centroid. Each group is drawn in a different color, which gives a clear spread of the clusters in the principal component space. The charts appear as several data clouds with different degrees of separation. Isolated clusters are usually well defined and correctly group skeletons of the same nature. However, zones with a high density of data can show a lower level of accuracy. It is worth mentioning that most of the skeletons with missing points fall into these areas, as highlighted in [Fig 2].
Because these areas are dense, the algorithm assigns few clusters to them, whereas more clusters are identified where the data are more spread out. For this reason K-means makes some errors in identifying the poses of the skeletons that fall into the “crowded” areas. These areas show a low training error because skeletons with few available points tend to take similar values in the principal component space, even though they are in fact imprecisely classified. As a result, different poses inside these zones are presumably grouped into a single cluster.
To quantify the errors of the different clusters, we computed for each cluster the normalized sum of the distances between its skeletons and the corresponding centroid [Table 1].
| Cluster number | Average error |
Table 1: errors with respect to the centroids
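A per-cluster error of this kind could be computed along the following lines. The sketch assumes that "normalized" means dividing the sum of Euclidean distances by the number of skeletons in the cluster, which is a guess based on the "Average error" column header; the function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def average_cluster_errors(data, labels, centroids):
    """Mean Euclidean distance from each cluster's members to its centroid."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for point, label in zip(data, labels):
        totals[label] += math.dist(point, centroids[label])
        counts[label] += 1
    # Assumed normalization: divide by the number of members in the cluster.
    return {label: totals[label] / counts[label] for label in sorted(counts)}


# Toy example with two clusters of two points each.
data = [[0, 0], [0, 2], [4, 0], [4, 6]]
labels = [0, 0, 1, 1]
centroids = [[0, 1], [4, 3]]
errors = average_cluster_errors(data, labels, centroids)
print(errors)  # {0: 1.0, 1: 3.0}
```

A low value means that the members of a cluster sit close to its centroid, but as noted above it does not by itself guarantee that the poses in that cluster are the same.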
The presented system, which includes a graphical user interface for creating skeletons from paintings and a machine learning algorithm, is a good starting point for further applying such techniques to the automatic recognition of patterns across different paintings.
Because it takes skeletons of people as input, the algorithm is completely invariant to the painter’s style and can also be applied to low-resolution images. Moreover, thanks to the implemented normalization procedure, the characteristic length scale of the depicted people does not affect the accuracy of the results. Finally, because the coordinates are fed directly to K-means, rotated or mirrored copies of the same people in different paintings can be correctly identified as well.
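The normalization procedure itself is not described in this post. As a purely hypothetical illustration of how scale invariance can be obtained, the sketch below centers each skeleton on its mean point and divides by its largest distance from that center; the actual procedure may well differ.

```python
import math


def normalize_skeleton(points):
    """Center a skeleton on its mean point and scale it to unit size."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    # Assumed size measure: the largest distance of any point from the center.
    scale = max(math.hypot(x, y) for x, y in centered)
    if scale == 0:
        raise ValueError("all points coincide, so the skeleton has no size")
    return [(x / scale, y / scale) for x, y in centered]


# The same toy pose drawn small, and drawn three times larger elsewhere.
small = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)]
big = [(10.0 + 3 * x, 20.0 + 3 * y) for x, y in small]
normalized_small = normalize_skeleton(small)
normalized_big = normalize_skeleton(big)
```

With this choice the two versions of the pose map to the same normalized coordinates up to floating-point rounding, so the size and position of a figure in the painting no longer influence the clustering.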
As shown in the results and in the pictures of this and the previous blog posts, the system can correctly recognize particular features such as Christ on the cross, the dead Christ and some other particular poses [Fig 3,4,5,6]. For other skeletons, such as those of standing people, it provides more general information on the orientation of the body.
1. Ng et al, “On spectral clustering: analysis and an algorithm”, Berkeley University.