After our meeting with Prof. Kaplan, we reoriented our work towards image processing. We decided to rely mainly on image processing to segment objects, and we no longer plan to work with skeletons. The goal is now to isolate objects from the background and then use different metrics to compare each object with other paintings. The objects of interest are still people, so we decided to start with standard person-detection techniques.
We tried two directions: automatic detection and semi-automatic detection.
For automatic detection, we looked into people detection in real-life photographs (our paintings are from the Renaissance period) and learned about algorithms for pedestrian detection. When we tried one on our paintings, it turned out to work quite well for standing people, but not for other poses (bending or sitting people, for example) or for partially occluded people. Because of these partial results, we cannot automatically detect all the people in a painting, so we will not be able to rely on this kind of technique alone; we will have to combine it with others.
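The detectors we tried all follow the same sliding-window scheme: scan the image with a fixed-size window and score each window with a trained classifier. A minimal Python/NumPy sketch of that scanning loop, with a placeholder scoring function standing in for the real pedestrian classifier (function names, stride, and threshold are illustrative choices, not the actual detector we ran):

```python
import numpy as np

def sliding_window_detect(image, win_h, win_w, score_fn, stride=8, threshold=0.5):
    """Scan the image with a fixed-size window and keep the windows whose
    score exceeds the threshold. score_fn is a stand-in for the real
    pedestrian classifier (e.g. a HOG + SVM model)."""
    detections = []
    H, W = image.shape[:2]
    for y in range(0, H - win_h + 1, stride):
        for x in range(0, W - win_w + 1, stride):
            window = image[y:y + win_h, x:x + win_w]
            s = score_fn(window)
            if s > threshold:
                detections.append((x, y, win_w, win_h, s))
    return detections

# Toy usage: "detect" bright regions in a synthetic image.
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
found = sliding_window_detect(img, 32, 32, score_fn=lambda w: w.mean())
```

A real detector additionally scans at several scales and merges overlapping detections, which this sketch omits.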
For semi-automatic detection, we used a Matlab GUI (Graphical User Interface) to select a region of interest (ROI). Had the pedestrian detection worked well, we could have selected these regions automatically. With this GUI, we can draw a rectangular selection to mark the region we want to extract.
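The rectangle returned by such a selection tool reduces to a simple crop. A minimal Python/NumPy sketch (the `rect` convention mirrors Matlab's `[xmin ymin width height]` vector; the function name is ours):

```python
import numpy as np

def crop_roi(image, rect):
    """Extract a rectangular region of interest.
    rect = (x, y, width, height), as returned by a rectangle-selection tool."""
    x, y, w, h = rect
    return image[y:y + h, x:x + w]

img = np.arange(100).reshape(10, 10)
roi = crop_roi(img, (2, 3, 4, 5))  # 4 wide, 5 tall, top-left corner at (2, 3)
```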
Once the region is defined, we tried standard techniques for extracting only the person, and the one that worked rather well was the active contour method. This technique starts from an initial estimate of the contour, in our case the rectangle that defines the ROI, and, subject to some constraints, iteratively evolves the contour until it fits the real boundary of the object.
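To illustrate the idea, here is a deliberately simplified greedy variant of the active contour (snake) in Python/NumPy. It is not the exact algorithm we used in Matlab: each iteration moves every contour point to the 3x3 neighbour that best trades off a continuity energy (keep points evenly spaced) against an edge-attraction energy (move onto strong image gradients); real implementations add curvature terms and better energy models.

```python
import numpy as np

def gradient_magnitude(image):
    gy, gx = np.gradient(image.astype(float))
    return np.sqrt(gx**2 + gy**2)

def greedy_snake(image, points, alpha=1.0, iterations=100):
    """Greedy active contour: 'points' is an (N, 2) array of (row, col)
    positions, e.g. sampled on the ROI rectangle."""
    edges = gradient_magnitude(image)
    edges = edges / (edges.max() + 1e-9)      # normalise edge energy to [0, 1]
    pts = points.astype(int).copy()
    H, W = image.shape
    for _ in range(iterations):
        # average spacing between consecutive contour points (closed contour)
        mean_d = np.mean(np.linalg.norm(np.diff(pts, axis=0, append=pts[:1]), axis=1))
        for i in range(len(pts)):
            prev_pt = pts[i - 1]
            best, best_e = pts[i], np.inf
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    r, c = pts[i][0] + dr, pts[i][1] + dc
                    if not (0 <= r < H and 0 <= c < W):
                        continue
                    # continuity: keep spacing close to the current average
                    cont = (np.linalg.norm([r - prev_pt[0], c - prev_pt[1]]) - mean_d) ** 2
                    e = alpha * cont - edges[r, c]   # lower is better
                    if e < best_e:
                        best_e, best = e, np.array([r, c])
            pts[i] = best
    return pts

# Toy usage: a circle of points initialised around a bright square.
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
theta = np.linspace(0, 2 * np.pi, 24, endpoint=False)
init = np.stack([20 + 11 * np.sin(theta), 20 + 11 * np.cos(theta)], axis=1)
snake = greedy_snake(img, init, alpha=0.2, iterations=50)
```

Note that the greedy snake has no "balloon" force: in flat regions with no gradient it does not contract on its own, which is one reason the initial contour (the ROI rectangle) matters so much in practice.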
The objects (persons) are then converted into bitmap images (only two gray levels: black and white) and compared to the entire painting, itself also converted to a bitmap. We slide the segmented object over the painting and apply a pixel-wise logical XOR between the two. For each position of the sliding object, we sum the result and look for the position where this sum is minimal (reminder: with the XOR operation, two identical values give 0 and two different values give 1, so the sum counts disagreeing pixels).
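The sliding XOR comparison can be sketched directly in Python/NumPy (the function name is ours):

```python
import numpy as np

def best_match_position(painting_bw, object_bw):
    """Slide the binary object over the binary painting and return the
    (row, col) offset where the XOR disagreement count is minimal."""
    H, W = painting_bw.shape
    h, w = object_bw.shape
    best_pos, best_score = (0, 0), np.inf
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            patch = painting_bw[r:r + h, c:c + w]
            score = np.sum(np.logical_xor(patch, object_bw))  # 0 where pixels agree
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Toy usage: hide a small binary pattern in a larger image and recover it.
obj = np.array([[1, 0], [0, 1]], dtype=bool)
painting = np.zeros((6, 6), dtype=bool)
painting[3:5, 2:4] = obj
pos, score = best_match_position(painting, obj)  # expect pos == (3, 2), score == 0
```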
The results we obtained with this method are encouraging, but they come from an 'easy' example in which the objects were rather well defined. This is not always the case: on other examples, extracting the person is much harder, depending on the luminosity and the contrast between the person and the background. Because of this variability, the constraints and parameters of the active contour algorithm change a lot from one image to another, which prevents the method from being generic: we would have to tune the parameters for each image. It is also not a miracle method: there are examples where people are extracted correctly but the matching with other paintings still fails.
Since none of these methods works well across different kinds of images, we decided to try object extraction techniques that are invariant to lighting conditions, colours, and textures. One such technique is used in the pedestrian detection algorithm and is called the Histogram of Oriented Gradients (HOG). The principle is to count occurrences of gradient orientations in localized portions of an image [Wikipedia]. The idea now is to extract HOG descriptors from all the images in order to build a database, and then to use the HOG features of a query image or ROI (obtained with our GUI) to find similar images in the database. The main challenge is to find an efficient way to compare HOG features, and we are currently working on this topic.
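To make the principle concrete, here is a stripped-down HOG in Python/NumPy, together with one possible comparison (plain Euclidean distance; chi-squared or histogram intersection are common alternatives). Real HOG implementations add block normalisation and interpolation between bins, which this sketch omits:

```python
import numpy as np

def hog_descriptor(image, cell=8, bins=9):
    """Minimal HOG: per-cell histogram of gradient orientations, weighted by
    gradient magnitude, concatenated and L2-normalised."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180          # unsigned orientation
    H, W = image.shape
    feats = []
    for r in range(0, H - cell + 1, cell):
        for c in range(0, W - cell + 1, cell):
            m = mag[r:r + cell, c:c + cell].ravel()
            a = ang[r:r + cell, c:c + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-9)

def hog_distance(a, b):
    """One simple way to compare two HOG descriptors."""
    return np.linalg.norm(a - b)

# Toy usage: identical stripe patterns match perfectly, while the same
# stripes rotated by 90 degrees give a large descriptor distance.
vert = np.tile([0, 0, 1, 1], (16, 4))   # 16x16 image with vertical stripes
horiz = vert.T                          # horizontal stripes
d_same = hog_distance(hog_descriptor(vert), hog_descriptor(vert))
d_diff = hog_distance(hog_descriptor(vert), hog_descriptor(horiz))
```

Because the descriptor is built from local gradient orientations and normalised, it is largely insensitive to absolute brightness and colour, which is exactly the invariance we are after.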