
Automatic Recognition of Palaces based on Pictures: Step 3

In the previous phase of our project, we gained a working understanding of the Django web development framework and created a moderately sized labelled data-set [1][2]. We further enlarged the data-set by selecting random patches from the images. In this edition of our blog, we describe the progress made towards our end goal, i.e., recognizing palaces based on pictures.
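To illustrate the random-patch idea, here is a minimal sketch of how patch coordinates could be drawn. The function name `random_patch_boxes`, the 640x480 image size and the 128x128 patch size are assumptions chosen for illustration, not the values used in the project.

```python
import random


def random_patch_boxes(width, height, patch_w, patch_h, count, seed=None):
    """Return `count` (left, top, right, bottom) boxes that lie fully inside the image."""
    if patch_w > width or patch_h > height:
        raise ValueError("patch is larger than the image")
    rng = random.Random(seed)  # seeded generator so the patches are reproducible
    boxes = []
    for _ in range(count):
        # randint is inclusive on both ends, so the patch can touch the border
        left = rng.randint(0, width - patch_w)
        top = rng.randint(0, height - patch_h)
        boxes.append((left, top, left + patch_w, top + patch_h))
    return boxes


# Hypothetical example: five 128x128 patches from a 640x480 image.
boxes = random_patch_boxes(640, 480, 128, 128, count=5, seed=0)
```

Each box uses the (left, upper, right, lower) convention, so it could be handed to a cropping routine such as Pillow's `Image.crop`.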


We have made rapid progress in this part of the project. A basic version of the website front-end and of the database for storing and retrieving images is complete. The front-end accepts an uploaded file, verifies that it is a valid image, stores it in a static directory on our filesystem, and associates it with a database model in Django. Once an image is uploaded, the user is redirected to a new page where the uploaded image is retrieved and displayed. This serves as our test of the website's basic functionality.
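The upload flow described above (validate, store, record) can be sketched without Django as follows. This is only an illustration using the standard library: the names `detect_image_type` and `store_upload` are hypothetical, the returned dictionary stands in for a database record, and checking the leading bytes is a much weaker test than the full decoding that Django's `ImageField` performs through Pillow.

```python
import os
import tempfile

# Leading "magic" bytes of a few common image formats.
IMAGE_SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
}


def detect_image_type(data):
    """Return the format name if the bytes start like a known image, else None."""
    for name, signature in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def store_upload(data, filename, static_dir):
    """Validate the upload, write it to static_dir and return a record for it."""
    kind = detect_image_type(data)
    if kind is None:
        raise ValueError("uploaded file is not a recognised image")
    os.makedirs(static_dir, exist_ok=True)
    safe_name = os.path.basename(filename)  # drop any directory components
    path = os.path.join(static_dir, safe_name)
    with open(path, "wb") as handle:
        handle.write(data)
    # Stand-in for the database row that would point at the stored file.
    return {"path": path, "kind": kind}


# Hypothetical example: a fake PNG (signature plus padding) stored in a temp dir.
demo_dir = tempfile.mkdtemp()
record = store_upload(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16, "palace.png", demo_dir)
```

The results page then only needs the stored record to look up the file and display it.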

Fig.1 Model front-end

Fig.2 Retrieved sample uploaded image from database

The only essential step remaining in this part of the project is to integrate the image-processing setup built in Python with the website. As mentioned in the previous blog post, we have already tested running OpenCV and several other libraries for the processing on the back-end of the website. We have also tested embedding Google Maps in our website alongside the images, based on the final result. We are working towards adding further fields to the database (at present it only stores images). As time permits, we are primarily looking at including more information about each palace, its neighboring locations, and some historical accounts associated with it.

Palace Recognition: Phase III

In this phase, we classified the Venetian palace data-set into 60 categories, which we created by scanning images from [1] and [2]. For image classification we used Random Trees with a spatial pyramid matching (SPM) kernel. Specifically, we used the extension of the SPM method developed in [3], which generalizes vector quantization to sparse coding, followed by multi-scale spatial max pooling of the SIFT sparse codes. Our classification pipeline for the palace data-set has three stages, namely SIFT feature extraction, spatial pooling and linear classification, as shown in the figure.

Fig.3 Classification approach[3]
We tried different features, including SIFT, HOG and DSIFT, among others. In addition, we tried different classifiers for this supervised learning task, such as SVM, Random Forest, Boost and K-Nearest Neighbor. We found that DSIFT features followed by spatial pooling and a Random Forest classifier yield better classification accuracy than the other combinations.
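The spatial pooling stage described above can be sketched as follows. This is a simplified illustration, not our actual implementation: the function name `spatial_pyramid_max_pool`, the 1x1, 2x2 and 4x4 pyramid levels, and the toy two-dimensional codes are assumptions, and the sparse coding step that produces the codes is not shown.

```python
def spatial_pyramid_max_pool(points, codes, levels=(1, 2, 4)):
    """Max-pool per-descriptor codes over a spatial pyramid.

    points: list of (x, y) descriptor positions, normalised to [0, 1]
    codes:  list of equal-length code vectors, one per descriptor
    Returns one flat feature vector of length dim * sum(n * n for n in levels).
    """
    if not codes:
        raise ValueError("at least one descriptor is required")
    dim = len(codes[0])
    feature = []
    for cells in levels:
        # One pooled vector per cell of the cells x cells grid, starting at zero.
        grid = [[0.0] * dim for _ in range(cells * cells)]
        for (x, y), code in zip(points, codes):
            col = min(int(x * cells), cells - 1)  # clamp x == 1.0 into the last cell
            row = min(int(y * cells), cells - 1)
            pooled = grid[row * cells + col]
            for k, value in enumerate(code):
                # Max pooling over absolute code values.
                if abs(value) > pooled[k]:
                    pooled[k] = abs(value)
        for pooled in grid:
            feature.extend(pooled)
    return feature


# Toy example: two descriptors in opposite corners, with 2-dimensional codes.
points = [(0.1, 0.1), (0.9, 0.9)]
codes = [[1.0, 0.0], [0.0, 3.0]]
feature = spatial_pyramid_max_pool(points, codes)
```

With these assumed levels, an image is summarised by 1 + 4 + 16 = 21 pooled vectors, and their concatenation is what the classifier receives.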

Results and Challenges

We had 60 categories, each with around 30 images. We used 80% of the images for training and the remaining 20% for testing. We found that our model over-fits because of the small number of images in the data-set: it reaches 100% accuracy on the training set but exactly 25% classification accuracy on the test set. We attribute the low test accuracy to having only about 30 images per category. Hence, we expect better results from increasing the number of images in the data-set.
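To make the numbers concrete, the sketch below performs a per-category 80/20 split, assuming exactly 30 images in each of the 60 categories (the real counts are only approximately 30). The function name `stratified_split` and the integer labels are assumptions for illustration.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train and test sets, category by category."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # random choice of which images go to training
        cut = int(round(len(indices) * train_fraction))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


# Assumed data-set shape: 60 categories with exactly 30 images each.
labels = [category for category in range(60) for _ in range(30)]
train_idx, test_idx = stratified_split(labels)

# Accuracy expected from guessing a category uniformly at random.
chance_accuracy = 1.0 / 60
```

Under this assumption, each category contributes 24 training images and only 6 test images. For reference, uniform random guessing over 60 categories would give about 1.7% accuracy.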


We are almost done with the front-end of the website, and we will now focus mainly on improving the performance of our classification algorithms by collecting more images. Finally, we will also make the website's appearance more attractive and much more informative for users.


[1] Rössler, Jan-Christoph. I Palazzi Veneziani. Venezia: Fondazione Giorgio Cini, 2010. Print.

[2] Fasolo, Andrea. Palazzi Di Venezia. Venezia: Arsenale, 2003. Print.

[3] J. Yang, K. Yu, Y. Gong, T. Huang. “Linear spatial pyramid matching using sparse coding for image classification.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.