Recognizing copies in paintings – post 3

Since the last blog post, we have focused on HoG (Histogram of Oriented Gradients) feature descriptors and how to compare them. So far, we have only worked on retrieving full copies of paintings.

Extraction

The HoG represents the distribution of intensity gradients, that is, of edge directions, in an image. The image is divided into cells, and a histogram of gradient orientations is computed over the pixels of each cell. The cell size sets the precision of the descriptor: large cells give a coarse description of the shape, small cells capture fine details. To make the descriptor more robust to illumination variations and shadows, the local histograms can be normalized with respect to the surrounding intensity: a measure of intensity is computed over larger regions, called blocks, and every cell in a block is normalized by this value. Each HoG is a vector whose size depends on the image size and the cell size. In our case, we describe each image with a single HoG descriptor, so we end up with one feature vector per image.
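To make the steps above concrete, here is a minimal sketch of a HoG extraction in plain Python. It is not the code used in this project: the function name hog_descriptor, the default parameters (8-pixel cells, 9 orientation bins, 2x2-cell blocks) and the choice of L2 block normalization are assumptions made for illustration, and the sketch leaves out refinements such as interpolation between neighbouring bins.

```python
import math

def hog_descriptor(img, cell=8, bins=9, block=2, eps=1e-6):
    """Simplified HoG of a grayscale image given as a list of rows of numbers.

    Illustrative only: parameter names and defaults are assumptions.
    """
    h, w = len(img), len(img[0])
    n_cy, n_cx = h // cell, w // cell
    # One orientation histogram per cell.
    hist = [[[0.0] * bins for _ in range(n_cx)] for _ in range(n_cy)]
    for y in range(n_cy * cell):
        for x in range(n_cx * cell):
            # Central differences, clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            # Unsigned gradient orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / (180.0 / bins)), bins - 1)
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[y // cell][x // cell][b] += mag
    # L2-normalize over overlapping blocks of block x block cells (stride 1 cell).
    feat = []
    for by in range(n_cy - block + 1):
        for bx in range(n_cx - block + 1):
            v = [val
                 for dy in range(block)
                 for dx in range(block)
                 for val in hist[by + dy][bx + dx]]
            norm = math.sqrt(sum(val * val for val in v) + eps * eps)
            feat.extend(val / norm for val in v)
    return feat
```

With these assumptions, the length of the vector is (n_cy - block + 1) * (n_cx - block + 1) * block * block * bins, where n_cy and n_cx are the numbers of cells along each axis. This shows how the descriptor size follows from the image size and the cell size: a 16x16 image with 8-pixel cells gives a single block and a vector of 36 values.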

Comparison

The idea now is to compare the HoG feature vector of a query image to all the feature vectors in a given database. To do this, we first normalize every feature vector by subtracting the mean; these centered vectors are the ones we use from here on. We then tried two techniques:

  • The first one finds the k nearest feature vectors in the database (in the Euclidean distance sense) using the k-Nearest Neighbours (k-NN) method.
  • The second one uses a simple score function that computes the scalar product between two feature vectors. We then keep the k highest scores.
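The two techniques can be sketched as follows in plain Python. This is a minimal illustration, not the project code: the function names are hypothetical, and centering each vector by its own mean is only one possible reading of the normalization step described above.

```python
import math

def center(v):
    """Subtract the vector's own mean from each component (assumed reading)."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def knn_euclidean(query, database, k):
    """Indices of the k database vectors closest to the query (Euclidean)."""
    dists = [(math.dist(query, v), i) for i, v in enumerate(database)]
    return [i for _, i in sorted(dists)[:k]]

def top_k_scores(query, database, k):
    """Indices of the k database vectors with the highest scalar product."""
    scores = [(sum(a * b for a, b in zip(query, v)), i)
              for i, v in enumerate(database)]
    ranked = sorted(scores, key=lambda t: -t[0])
    return [i for _, i in ranked[:k]]

# Toy data standing in for real HoG vectors.
database = [center(v) for v in ([1, 2, 3], [2, 4, 6], [3, 2, 1])]
query = center([1, 2, 3])

print(knn_euclidean(query, database, 2))  # [0, 1]
print(top_k_scores(query, database, 2))   # [1, 0]
```

In this toy example the two methods return the same two vectors but in a different order. The squared Euclidean distance between q and v equals |q|^2 + |v|^2 - 2 q.v, so the two rankings are only guaranteed to agree when all database vectors have the same norm; a vector with a large norm can get a high score while being far from the query.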

GUI

We have also started implementing a GUI (Graphical User Interface) that allows the user to:

  • load a database folder containing the images, or a .mat file containing all the feature vectors
  • load the query image
  • choose the method (kNN or score)
  • visualize the 4 best matches corresponding to the chosen method

An example of the results we obtain using k = 4:

Fig. 1: Results for L’Ultima Cena using the kNN method
Fig. 2: Results for L’Ultima Cena using the score method
Fig. 3: Results for Autunno using the score method
Fig. 4: Results for Autunno using the kNN method

These results show that the relative performance of the two methods depends on the query image. For L’Ultima Cena, the score method performs slightly better than kNN, but for Autunno it is the reverse. We also noticed that the results change with the cell size, and that we could choose a cell size that gave the best results. Understanding where these differences in performance come from should allow significant improvements in retrieving copies.

Next stage

The next step is to improve the current results by tuning the parameters of the HoG extraction. We need to optimize the cell size, the block size and the number of overlapping cells between blocks for our type of application (possibly after a quick literature review of applications similar to ours). We will also need to retrieve partial copies, so we have to decide how to split images into smaller parts on which to compute the HoG. This also means updating the current GUI with the ROI option of the previous one (see blog post 2). Finally, we need to improve our test database: we need more paintings that each have at least 4-5 copies, so that we can try the algorithm on different kinds of paintings.
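As a starting point for the partial-copy question, one possible way to split an image is a grid of fixed-size, possibly overlapping windows. The sketch below only illustrates that option and is not a decision we have made: the function name sliding_windows and its parameters win and stride are hypothetical.

```python
def sliding_windows(height, width, win, stride):
    """Yield (top, left, bottom, right) boxes of size win x win.

    Only boxes that fit entirely inside the image are produced, so a
    strip along the bottom and right borders may be left out when
    (height - win) or (width - win) is not a multiple of stride.
    """
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield (top, left, top + win, left + win)

# Example: 2x2 windows with no overlap on a 4x4 image.
for box in sliding_windows(4, 4, win=2, stride=2):
    print(box)
```

A HoG descriptor could then be computed on each window and compared to the database in the same way as for a full image. A stride smaller than the window size gives overlapping windows, at the cost of more descriptors per image.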

References

  1. Data-driven Visual Similarity for Cross-domain Image Matching