In this blogpost, we summarize our progress in processing the paintings data since the last blogpost, as well as the ongoing work of presenting everything graphically.
As indicated in the previous blogpost, we finally received the data that we wanted and have decided how to deal with it. There is one main constraint that we are forced to follow (we can only collect paintings that are currently located in the USA and the UK) and one other that we chose to follow to limit the amount of data for the moment (all paintings must have been painted by Italian painters).
One major problem we mentioned in the previous blogpost was that we could only extract 10,000 rows (that is, 10,000 hops in total across all the paintings). However, we sent an email to the Getty Provenance Index support team asking for the full data, and after one month of waiting, we finally received a reply with the full dataset: 22,715 rows covering roughly 3,000 to 4,000 paintings (a quick estimate).
The next step is to create the database. Unfortunately, the data contains a few unusable rows, such as hops with unknown locations or mis-formatted entries. Since these are only minor flaws, we simply filter those rows out. The “Location” column is also not neatly formatted, an example of which is:
What we want now is to extract the city and country names from this column. To do that, we downloaded a list of cities and countries along with their latitudes and longitudes from http://download.geonames.org/ and, using PostGIS, matched the two datasets together to create three new columns in the painting data file: the name of the city mentioned, the corresponding country, and its latitude and longitude. After this step, our database is nearly ready to be visualized.
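To make the matching idea concrete, here is a minimal sketch in Python rather than the PostGIS query we actually ran. It assumes a tiny in-memory gazetteer; the `GAZETTEER` list, the `match_location` helper and the sample location strings are all hypothetical and only illustrate searching a messy location string for a known city name.

```python
import re

# Hypothetical miniature gazetteer for illustration only; the real GeoNames
# download contains far more cities and columns. Coordinates are approximate.
GAZETTEER = [
    {"city": "London", "country": "United Kingdom", "lat": 51.51, "lon": -0.13},
    {"city": "New York", "country": "United States", "lat": 40.71, "lon": -74.01},
    {"city": "Florence", "country": "Italy", "lat": 43.78, "lon": 11.25},
]


def match_location(raw, gazetteer=GAZETTEER):
    """Return the first gazetteer entry whose city name occurs in `raw`, else None."""
    # Try longer names first so a multi-word city is not shadowed by a shorter one.
    for entry in sorted(gazetteer, key=lambda e: len(e["city"]), reverse=True):
        pattern = r"\b" + re.escape(entry["city"]) + r"\b"
        if re.search(pattern, raw, flags=re.IGNORECASE):
            return entry
    return None  # unknown or unusable location: the row would be filtered out
```

A row whose location does not match any known city returns `None`, which is one simple way to implement the filter for unusable rows. Plain name matching like this cannot tell apart cities that share a name, so a real pipeline would need an extra disambiguation rule.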
Now that we have all the data we need, namely all the Italian paintings and all their known movements, it’s time to put it all together and build a web application based on it.
The final product should look like a map of the world which we can zoom in and out of, with multiple markers. Each marker would represent a unique painting at a specific place and time. Clicking on the marker would then fade the other paintings from the map so that the entire migration of the selected painting becomes clearer.
Another possibility would be a heat-map showing the concentration of paintings over a time period chosen by the user. However, our dataset is biased towards the USA and England, so such a map would mostly reflect that bias. Therefore, we chose to focus on the journey of each painting.
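Focusing on journeys means grouping the hops by painting and ordering them in time. The following is a rough sketch of that step, assuming each hop is a dictionary with `painting_id` and `year` keys; these field names and the `build_journeys` helper are our own placeholders, not the actual column names of the dataset.

```python
from collections import defaultdict


def build_journeys(hops):
    """Group hops by painting and sort each painting's hops chronologically."""
    journeys = defaultdict(list)
    for hop in hops:
        journeys[hop["painting_id"]].append(hop)
    for painting_hops in journeys.values():
        # Assumes every hop carries a numeric year (hypothetical field).
        painting_hops.sort(key=lambda h: h["year"])
    return dict(journeys)
```

Each value in the result is then the ordered path that the map would highlight when the user clicks on one of that painting's markers.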
To achieve this result, we will need to move our data from Access to PostGIS, a spatial extension of the PostgreSQL database management system. From there we will be able to add precise coordinates to each and every record, and export them to a (rather large) file which will be used by our application.
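We have not settled on the format of the exported file yet. As one possible sketch, assuming we choose GeoJSON (a common format for web maps), the export could look like the following. The `hops_to_geojson` helper and the field names `painting_id`, `city`, `year`, `lat` and `lon` are hypothetical.

```python
import json


def hops_to_geojson(hops):
    """Convert geocoded hops into a GeoJSON FeatureCollection of points."""
    features = []
    for hop in hops:
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [hop["lon"], hop["lat"]]},
            "properties": {
                "painting_id": hop["painting_id"],
                "city": hop["city"],
                "year": hop["year"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


def geojson_text(hops):
    """Serialize the collection to a string that can be written to a file."""
    return json.dumps(hops_to_geojson(hops))
```

The resulting string can be saved to disk and loaded by the map in the browser, with one marker per feature.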