VenitianBot

Man is by nature a social animal

– Aristotle, Politics

With the rise of social media, we share more and more information online about where we are, what we do and what we like. What if we could harness this knowledge, learn from it, and use it to tell people more about the places they visit and photograph, and the things they like and talk about? Once the data in Venice’s archives is digitized, a great deal of information will be available, and social media is the best way to share it with the world.

Objectives:

Build an autonomous program that tweets about Venice and interacts on Twitter with users who tweet about Venice. It will select interesting tweets based on the shared location, the hashtags, the content of the post and, possibly, pictures. Once the filtering is done, the bot will look up relevant information (for example by querying a database) and tweet it back. The VenitianBot will thus share historical facts about the places, buildings and events that users mention.

Deliverables:

  1. The VenitianBot implementation (probably in Java or Python) on GitHub
  2. Documentation on VenitianBot: how to deploy it, use it and modify it
  3. Report about the features of VenitianBot and possible improvements

Methodology:

1. What will the VenitianBot tweet about?

The Venice Time Machine project will gather a huge amount of data about the history of Venice. Spreading this information through social media will let the entire world know about the work of the Digital Humanities lab.

To achieve this, we will use the Twitter platform. To gain popularity and have a real impact, the bot needs to interact with users. We plan a mechanism that recognizes interesting tweets and replies with historical and fun facts, photographs, drawings, paintings or a link to the Venice Time Machine project.

To give users more context, we will add hashtags such as #EPFL, #VeniceTM (Venice Time Machine) and #DigitalHumanities.

2. How do we recognize relevant tweets?
  • Hashtags

The obvious answer is the hashtags in the tweets. People tweeting about Venice often add hashtags such as #Venice, #Italy, #Rialto or #GrandCanal.

However, after observing actual tweets, we noticed that hashtags alone are not enough to decide. For example, we would not want to react when people tweet about Venice Beach in California.
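As a rough illustration of this first filter, the sketch below accepts a tweet only if it carries one of the Venice hashtags and does not match an exclusion phrase. The hashtag set, the exclusion list and the function names are assumptions made for this example, not a final design.

```python
import re

# Hypothetical lists, chosen for illustration only.
VENICE_HASHTAGS = {"venice", "rialto", "grandcanal"}
EXCLUDED_PHRASES = ("venice beach", "venicebeach", "california")

HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the lowercase hashtags found in a tweet, without the '#'."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def looks_like_venice_italy(text):
    """True if a Venice hashtag is present and no exclusion phrase matches."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in EXCLUDED_PHRASES):
        return False
    return bool(extract_hashtags(text) & VENICE_HASHTAGS)
```

A tweet such as "Skating at #Venice Beach, California" is rejected by the exclusion phrases, while "Sunset over the #GrandCanal #Venice" is kept.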


  • Natural language processing

We can use natural language processing to extract more information from the post. We can learn where a person is tweeting from, or which part of Venice they are tweeting about. This information may not be present in the hashtags, and geo-localization may not be active. We will use it to give more targeted information to the user. We can also filter out tweets that we do not care about, such as promotional tweets.
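Before any real natural language processing is in place, plain keyword matching can serve as a crude stand-in. The sketch below looks for place names in the text and flags likely promotions. The place aliases and promotional cues are hypothetical examples, and substring matching like this will miss many cases that a proper NLP approach would catch.

```python
import re

# Hypothetical place aliases and promotional cues, for illustration only.
PLACE_ALIASES = {
    "Rialto Bridge": ("rialto",),
    "St. Mark's Square": ("st. mark", "san marco"),
    "Grand Canal": ("grand canal", "canal grande"),
}
PROMO_CUES = ("% off", "discount", "book now", "promo code")


def normalize(text):
    """Lowercase, straighten apostrophes and collapse whitespace."""
    text = text.lower().replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip()


def mentioned_places(text):
    """Return the known places whose aliases appear in the tweet text."""
    lowered = normalize(text)
    return [
        place
        for place, aliases in PLACE_ALIASES.items()
        if any(alias in lowered for alias in aliases)
    ]


def is_promotion(text):
    """True if the tweet contains one of the promotional cues."""
    lowered = normalize(text)
    return any(cue in lowered for cue in PROMO_CUES)
```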

  • Geo-localization

When users send a tweet from a smartphone, they can choose to share their current location.

This can be used to locate the origin of a tweet without relying on hashtags, which, as noted above, are sometimes misleading. To know which place corresponds to a pair of coordinates, we will define circles enclosing the places that we want to identify. One problem is that the phone's GPS reading at the time of the tweet might not be very accurate, so we have to take that into account when choosing the radius of each circle. However, we expect this inaccuracy to be rather small: an experiment shows that, under open sky, around 90% of GPS readings are within three meters of the real position [1].
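The circle test can be sketched as follows: compute the great-circle (haversine) distance between the tweet's coordinates and the center of each circle, and accept the nearest place whose radius, enlarged by a GPS margin, contains the point. The coordinates, radii and margin below are approximate values assumed for illustration only.

```python
import math

EARTH_RADIUS_M = 6371000.0

# (name, latitude, longitude, radius in meters): approximate, assumed values.
PLACES = [
    ("Rialto Bridge", 45.4380, 12.3359, 80.0),
    ("St. Mark's Square", 45.4342, 12.3385, 120.0),
]

# Extra tolerance for GPS error; the value is an assumption.
GPS_MARGIN_M = 10.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def locate(lat, lon):
    """Return the nearest place whose circle contains the point, or None."""
    best = None
    for name, plat, plon, radius in PLACES:
        distance = haversine_m(lat, lon, plat, plon)
        if distance <= radius + GPS_MARGIN_M:
            if best is None or distance < best[1]:
                best = (name, distance)
    return best[0] if best else None
```

Picking the nearest center matters once circles overlap, which is likely in a dense city center.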

  • Pictures

It will not be possible to implement this part within the time frame of this project. However, another DH project does something related [2], so it could later be merged into the VenitianBot or used as a black box, for example to recognize the Rialto Bridge or St. Mark’s Square and bring more precise and relevant information to the user.

3. How do we handle tweets directed to the VenitianBot?

What if a user wants more information and asks the bot a question? There are two possible answers, one much simpler than the other.

  • Pre-recorded replies

At first, because the bot will not be very advanced, we will create replies that direct users to other sources of information, such as the Digital Humanities lab or the Venice Time Machine project. That way, they will not feel abandoned by the bot when they need better, more precise information.
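A minimal sketch of such pre-recorded replies is shown below. It picks a canned answer by keyword, falls back to a default pointer to the project, and appends project hashtags while the text stays within a length limit. The reply wording, the keywords and the 140-character limit are assumptions for this example.

```python
# Hypothetical canned replies; the wording is illustrative only.
DEFAULT_REPLY = "For more details, have a look at the Venice Time Machine project."
KEYWORD_REPLIES = {
    "archive": "The Venice Time Machine project is digitizing Venice's archives.",
    "epfl": "This bot comes from the Digital Humanities lab at EPFL.",
}
PROJECT_HASHTAGS = ("#EPFL", "#VeniceTM", "#DigitalHumanities")
MAX_LENGTH = 140  # assumed tweet length limit


def build_reply(username, question):
    """Build a reply to `username`, choosing a canned answer by keyword."""
    lowered = question.lower()
    body = DEFAULT_REPLY
    for keyword, reply in KEYWORD_REPLIES.items():
        if keyword in lowered:
            body = reply
            break
    text = f"@{username} {body}"
    # Append hashtags only while the reply stays within the limit.
    for tag in PROJECT_HASHTAGS:
        candidate = f"{text} {tag}"
        if len(candidate) > MAX_LENGTH:
            break
        text = candidate
    return text
```

Keeping the replies in a plain dictionary makes it easy to add or reword answers without touching the matching logic.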

  • Natural language processing

This option is too ambitious for the given time frame, but someone may take over the project in the future and add it. The bot would then be able to answer simple questions, redirect users to more relevant sources, and so on.

Milestones:

  • Week 1 to 3: Skeleton of the bot
  • Week 4 to 7: Tweet recognition
  • Week 8 to 10: Defining replies
  • Week 11 to 13: Testing with database
  • Week 14: Presentation

References:

1. Smartphones, Tablets and GPS Accuracy, Jeff Shaner, July 15, 2013

2. Automatic recognition of Palaces based on Pictures