GIS Layers & Infectious Model Refinement

In this blog post we present the current progress of our project, which develops the ideas presented in the previous post.

With the help of our supervisor, we were able to get access to GIS (Geographic Information System) layers of the canals and streets of Venice. These layers are defined in a node-edge fashion and can be extracted from 2 text files:

  • One file containing a list of more than 6000 nodes representing crossroads, defined in Cartesian coordinates
  • The other file containing a list of more than 7000 edges, each with the IDs of the 2 nodes it connects


These layers can be easily visualized in QGIS, where they show a realistic map of Venice whose connections plausibly represent the routes people used over past centuries. We therefore defined a strategy for importing the layer data into MATLAB and tested its quality by plotting it. The nodes were displayed correctly, but some aberrant connections appeared between highly distant nodes. After a discussion with our supervisor, it was established that this issue arises because one edge can connect more than 2 nodes (i.e. 2 end nodes and some intermediate nodes), and the path taken between the first and last nodes is not yet available. Other students had already run into this issue while working on the same dataset and were fortunately able to provide us with a clean dataset.
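As an illustration of how such aberrant connections could be spotted automatically rather than by eye, here is a minimal Python sketch (not our MATLAB import code). It assumes the nodes have already been parsed into a dictionary of coordinates and the edges into pairs of node IDs. The function names, the toy coordinates and the threshold factor of 5 are all hypothetical choices made for the example.

```python
import math
import statistics

def edge_lengths(nodes, edges):
    """Euclidean length of each edge; nodes maps id -> (x, y)."""
    return [math.dist(nodes[a], nodes[b]) for a, b in edges]

def flag_long_edges(nodes, edges, factor=5.0):
    """Return the edges longer than `factor` times the median edge length.

    The factor is an arbitrary illustrative threshold, not a value
    taken from the Venice dataset.
    """
    lengths = edge_lengths(nodes, edges)
    threshold = factor * statistics.median(lengths)
    return [edge for edge, d in zip(edges, lengths) if d > threshold]

# Toy data: three nearby crossroads and one very distant node.
nodes = {1: (0.0, 0.0), 2: (10.0, 0.0), 3: (10.0, 10.0), 4: (500.0, 500.0)}
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]

suspicious = flag_long_edges(nodes, edges)
print(suspicious)
```

A length filter like this only detects the symptom. The underlying cause described above (edges that silently pass through intermediate nodes) still has to be fixed in the dataset itself.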


The unexpected access to geographic data of larger extent and higher accuracy led us to redefine some aspects of the project. Keeping in mind that the specificity of this project lies in having to take both spatial and temporal dimensions into account in order to describe the very particular context of Venice, we decided to combine 2 different models, which together generate new data at each time step:

  • A basic SIR model applied locally at each node, which computes the temporal evolution of the epidemic in local populations
  • A global small-world network applied between nodes, which simulates the travel of individuals from node to node and thus takes spatial parameters into account (population density being represented by node density)


The simulation strategy is the following:

At a given time step of the epidemic wave, the local population of each node i is described by:

  • Ni(t): total population at a given node
  • Si(t): number of susceptible individuals inside this local population
  • Ii(t): number of infected individuals inside this local population
  • Ri(t): number of individuals that cannot infect/be infected anymore by the disease (most probably because of death in the case of the Plague)
  • The SIR model defines at any time step: Ni(t) = Si(t) + Ii(t) + Ri(t)


Initial conditions:

  • Ii(0) = 0 for the majority of nodes (initially free from any infected individuals)
  • Ii(0) > 0 for a few nodes containing a small number of individuals who initially carry the disease into the city
  • Ni(0) ≈ 15 individuals for each node (a total population of ≈100’000 individuals at the time of the Plague waves, divided among the ≈6000 nodes of the Venice map; this can later be readjusted using population densities per parish obtained from the literature)
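The initial conditions above can be sketched as follows, in Python rather than MATLAB. This is only an illustration under stated assumptions: the population is split evenly, 5 seed nodes are drawn at random, and each seed node starts with a single infected individual. None of these choices comes from our actual implementation, and `init_populations` is a hypothetical helper.

```python
import random

def init_populations(node_ids, total_population, seed_nodes, seed_infected=1):
    """Split the population evenly over the nodes and infect a few seed nodes.

    Returns a dict: node id -> {"S": ..., "I": ..., "R": ...}.
    """
    base, extra = divmod(total_population, len(node_ids))
    state = {}
    for k, nid in enumerate(node_ids):
        n = base + (1 if k < extra else 0)   # spread the remainder over the first nodes
        i = min(seed_infected, n) if nid in seed_nodes else 0
        state[nid] = {"S": n - i, "I": i, "R": 0}
    return state

node_ids = list(range(6000))                 # stand-in for the real node IDs
rng = random.Random(0)                       # fixed seed for reproducibility
seed_nodes = set(rng.sample(node_ids, 5))    # hypothetical number of seed nodes
state = init_populations(node_ids, 100_000, seed_nodes)

total = sum(p["S"] + p["I"] + p["R"] for p in state.values())
print(total, sum(p["I"] for p in state.values()))
```

With exactly 6000 nodes and 100’000 inhabitants, the even split gives 16 or 17 individuals per node, which is the same order of magnitude as the ≈15 quoted above.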

In this context, the large number of nodes is an advantage because it fits our model: within each node, the number of people in contact with each other is realistic. This is a good way to simulate close contact within a family.

At each time step of the simulation:

  • 1st update of the different categories of local populations according to an SIR model at each node:
    • dS/dt = -β*S(t)*I(t)
    • dI/dt = β*S(t)*I(t) - γ*I(t)
    • dR/dt = γ*I(t)

where 1/γ is the average infectious period (expressed in time steps) and β is the contact rate.
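One simple way to turn these differential equations into a per-step update is a forward Euler step, sketched below in Python. This is an assumed discretization for illustration, not necessarily the one used in our MATLAB code. The clamping with `min` is our own addition, so that a large step cannot move more individuals than the compartment contains.

```python
def sir_step(s, i, r, beta, gamma, dt=1.0):
    """One forward Euler step of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
    dR/dt = gamma*I for a single node. Returns the updated (s, i, r)."""
    new_infections = min(beta * s * i * dt, s)   # cannot infect more than S
    removals = min(gamma * i * dt, i)            # cannot remove more than I
    return s - new_infections, i + new_infections - removals, r + removals

# One node of 15 people with a single infected individual,
# contact rate 0.2 and an infectious period of 4 days (gamma = 1/4).
s, i, r = 14.0, 1.0, 0.0
trajectory = [(s, i, r)]
for _ in range(20):
    s, i, r = sir_step(s, i, r, beta=0.2, gamma=0.25)
    trajectory.append((s, i, r))

print(trajectory[1])
```

Because every individual leaving one compartment enters another, the total S + I + R stays constant within a node during this step. Only the travel step changes the population of a node.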

  • 2nd update of local populations according to potential movements of individuals to other nodes: each node is given a proportion of moving individuals (within the susceptible and infected populations) according to:
    • its local population
    • the number of infected individuals inside it

Each individual of the “travelling population” then has to be assigned a new destination node. Two strategies are possible:

  • Consider a small time step (< 1 hour) and only allow travelling people to move to adjacent nodes.
  • Consider a greater time step (≈ 1 day) and allow people to move to any other node on the map
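For the first strategy, the adjacent nodes of each node can be derived directly from the edge list. The following Python sketch shows one possible way to do it. The helper names and the uniform choice among neighbours are assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_adjacency(edges):
    """Turn a list of (id_a, id_b) edges into an undirected adjacency map."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def step_to_neighbour(node, adjacency, rng):
    """Move a traveller to a uniformly chosen adjacent node.

    A node without neighbours keeps its traveller in place.
    """
    neighbours = sorted(adjacency.get(node, ()))
    return rng.choice(neighbours) if neighbours else node

edges = [(1, 2), (2, 3), (3, 4)]
adjacency = build_adjacency(edges)
rng = random.Random(0)
print(step_to_neighbour(2, adjacency, rng))
```

With a time step below 1 hour, a traveller would take many such single-edge steps per day, which is how this strategy still allows long-distance movement.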


Depending on the strategy, the SIR parameters (contact rate and average infectious period) will have to be adapted to the time step in order to keep the model realistic. Regardless of the region of interest, the probability that an individual travels from their current node to another one will follow a power law of the inter-nodal distance. These probabilities will then be used as weights in the random destination assignment. Furthermore, within a time span of one day, both strategies allow individuals to travel relatively long distances and can thus give rise to new outbreaks at different locations.
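The weighted destination assignment could look like the Python sketch below. The exponent `alpha` and the minimum distance `d_min` are hypothetical parameters, since we have not fixed the exact form of the power law yet. The weight of a destination is taken as d^(-alpha), where d is the Euclidean distance from the origin node.

```python
import math
import random

def destination_weights(origin, nodes, alpha=2.0, d_min=1.0):
    """Power-law weight d**(-alpha) for every node other than the origin.

    nodes maps id -> (x, y); d_min avoids a division by zero for
    (nearly) coincident nodes. alpha and d_min are illustrative values.
    """
    weights = {}
    for nid, xy in nodes.items():
        if nid == origin:
            continue
        d = max(math.dist(nodes[origin], xy), d_min)
        weights[nid] = d ** (-alpha)
    return weights

def pick_destination(origin, nodes, rng, alpha=2.0):
    """Draw one destination node at random, using the weights above."""
    weights = destination_weights(origin, nodes, alpha)
    ids = list(weights)
    return rng.choices(ids, weights=[weights[n] for n in ids], k=1)[0]

nodes = {0: (0.0, 0.0), 1: (10.0, 0.0), 2: (100.0, 0.0)}
w = destination_weights(0, nodes)
rng = random.Random(0)
print(w, pick_destination(0, nodes, rng))
```

In this toy example, node 2 is 10 times farther from the origin than node 1, so with alpha = 2 it is 100 times less likely to be chosen. `random.choices` normalizes the weights itself, so they do not need to sum to 1.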


Although the full model is not yet implemented, we were able to run a minimal simulation in MATLAB, in which we could:

  • Set the SIR parameters: contact rate = 0.2, average infectious period = 4 days
  • Load the current nodes of Venice
  • Initialize the local populations
  • Initially infect a small proportion of individuals in a localized region containing 20% of the nodes
  • Run the simulation for 20 time steps of 1 day, updating local populations with the SIR model and generating random travel for some individuals
  • Plot the percentage of infected individuals in each node as a function of time

This simulation can be seen below.



Although quite minimalist, this simulation allows us to distinguish 2 patterns:

  • The rapid and uniform evolution of populations in the initial outbreak region: the proportion of infected individuals in these nodes increases very quickly and everyone is infected after a few steps; people then stay infected for the defined period and eventually all die (i.e. the proportion of infected individuals decreases progressively down to 0)
  • The slower and more random evolution of populations in the initially uninfected nodes: at each step, only a few of these nodes become “infected” through the arrival of a travelling infected individual from the outbreak region. Each infected node then follows the same evolution: a drastic local increase in infection followed by a progressive decrease down to 0. The timing, however, differs from node to node, so it is difficult to define a common pattern for them.

Despite these differences, the SIR parameters chosen here lead to epidemic propagation of the disease, so that nearly everyone has died after 20 steps.
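A short calculation suggests why these parameters make every seeded node take off. It assumes that S and I in the local equations are head counts, as defined above, and that the contact rate of 0.2 is expressed per day. The ratio called R_local below is our own shorthand for this post.

```latex
\frac{dI}{dt} = \bigl(\beta S - \gamma\bigr)\, I > 0
\quad\Longleftrightarrow\quad
S > \frac{\gamma}{\beta} = \frac{1/4}{0.2} = 1.25

% For a node with about 15 susceptible individuals:
R_{\text{local}} = \frac{\beta S}{\gamma} = 0.2 \times 15 \times 4 = 12
```

Under these assumptions, any node holding at least 2 susceptible individuals sees its infection grow as soon as one infected traveller arrives, which is consistent with the sharp local increases observed in the simulation.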


Ultimately, we would like to fine-tune our model so that it can exhibit both stable and unstable behaviour, as is often observed in non-linear equations, and thereby produce different patterns of infection spread. Susceptibility criteria could be implemented for each individual to further increase the resemblance to a real epidemic outbreak. Finally, we believe that starting with a simple model and upgrading it one step at a time is a reasonable approach given the time remaining.