In this report, we present the contents of our blog posts along with our final results.
Our project consists of developing a model and a visualization tool showing the spread of plague epidemics in Venice throughout the Middle Ages and the Renaissance. This disease devastated one of the richest cities of medieval Europe, and understanding its dynamics can lead to interesting findings in epidemiology. In the end, we did not use any data from the Provveditori alla Sanita register (Archivio di Stato di Venezia) as we first intended. It seems that the digitization of such data is still difficult in the context of the Venice Time Machine initiative. Nevertheless, we believe that our model could be trained on this data once it becomes available. Once necrology registers and maps are digitized, we think that our work can serve as a framework to study the plague in Venice and, more importantly, to compare and fit this data, potentially yielding the first real-world spatio-temporal model of the plague.
First of all, we describe the model we used to illustrate the spread of the plague epidemics in Venice. The appeal of the project lies in its duality: while modelling the propagation of a disease in a particular spatial environment and studying the temporal evolution of an infection with differential equations have each already been achieved, we failed to find any relevant literature that combines the two.
To solve this problem, we came up with an approach that takes into account both the spatial and the temporal dimensions in order to model the spread of the epidemics in the context of Venice. We decided to represent the city as a network of nodes extracted from the GIS geolocations of street intersections, which we believe reflect the local population densities found at given points of the city. We initialized the number of people per node to fit the total population of Venice at that time (~140,000 people). The nodes govern the spatial component of the spread and contain three types of sub-populations: the susceptible, infected and removed sub-populations. At each time step, susceptible and infected people are able to move to another node with a probability that depends on the kernel used (exponential or power law). The probability of moving from one node to another thus depends on the inter-nodal distance, which is suited to a radial spread of the disease. To lower the computational cost linked to the large number of nodes and possible movements, we implemented an alias method, which makes drawing destinations from these probabilities much quicker. This method requires a relatively costly initialization but makes each draw very cheap at run time (cf. blog post 3). To reduce computation costs further, we also decided to consider only ~20% of the nodes included in the GIS data, selecting a subset that is still representative of the organisation of the city.
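To make the alias method more concrete, here is a minimal C++ sketch, not taken from the program's source. It builds an alias table for the destinations reachable from one origin node and draws destinations from it. The names `AliasTable`, `exponentialKernel` and `scale` are hypothetical, and Vose's construction is assumed as the variant of the method.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical kernel: unnormalized weight of a move over a given distance.
double exponentialKernel(double distance, double scale) {
    return std::exp(-distance / scale);
}

// Hypothetical alias table for the destinations reachable from one node.
struct AliasTable {
    std::vector<double> prob;        // chance of keeping the drawn column
    std::vector<std::size_t> alias;  // destination used otherwise

    // Costly part, done once: O(n) construction from unnormalized weights.
    explicit AliasTable(const std::vector<double>& weights)
        : prob(weights.size(), 1.0), alias(weights.size(), 0) {
        const std::size_t n = weights.size();
        assert(n > 0);
        double total = 0.0;
        for (double w : weights) {
            assert(w >= 0.0);
            total += w;
        }
        assert(total > 0.0);

        // Rescale so that the average column height is exactly 1.
        std::vector<double> scaled(n);
        std::vector<std::size_t> small, large;
        for (std::size_t i = 0; i < n; ++i) {
            alias[i] = i;
            scaled[i] = weights[i] * static_cast<double>(n) / total;
            (scaled[i] < 1.0 ? small : large).push_back(i);
        }
        // Fill each under-full column with mass from an over-full one.
        while (!small.empty() && !large.empty()) {
            const std::size_t s = small.back(); small.pop_back();
            const std::size_t l = large.back(); large.pop_back();
            prob[s] = scaled[s];
            alias[s] = l;
            scaled[l] -= 1.0 - scaled[s];
            (scaled[l] < 1.0 ? small : large).push_back(l);
        }
        // Columns left over keep prob = 1 and alias = themselves.
    }

    // Cheap part, done at every move: one column draw and one coin flip.
    template <class Rng>
    std::size_t sample(Rng& rng) const {
        std::uniform_int_distribution<std::size_t> column(0, prob.size() - 1);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        const std::size_t i = column(rng);
        return coin(rng) < prob[i] ? i : alias[i];
    }
};
```

In this sketch, one table would be built per origin node from the kernel weights of all candidate destinations, which is where the initialization cost comes from. Each subsequent draw costs the same regardless of the number of nodes.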
Concerning the temporal evolution of the disease at each node, we implemented a local SIR model which accurately represents the contamination probability in a local population. The simulation strategy is the following:
At a given time step of the epidemic wave, the local population of each node is defined by:
- Ni(t): total population at a given node
- Si(t): number of susceptible individuals inside this local population
- Ii(t): number of infected individuals inside this local population
- Ri(t): number of individuals that cannot infect/be infected anymore by the disease (most probably because of death in the case of the Plague)
The SIR model defines at any time step: Ni(t) = Si(t) + Ii(t) + Ri(t)
For each time step of the simulation:
1st update of the different categories of local populations according to an SIR model at each node:
- dS/dt = -β*S(t)*I(t)
- dI/dt = β*S(t)*I(t) - γ*I(t)
- dR/dt = γ*I(t)
With 1/γ = average infectious period (expressed in time steps) and β = contact rate. The disease has particular dynamics: there is a period during which a person is infected without knowing it and still moves around, followed by a period during which the disease strikes and the person generally moves less. The portion of the population spreading the disease unknowingly can be influenced through the average infectious period.
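As an illustration of the first update, the sketch below advances the three sub-populations of a single node by one time step. It assumes a deterministic forward-Euler discretization of the equations above, which is a simplification chosen for the example and not necessarily the scheme used in the program. `NodeState`, `sirStep` and `dt` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical container for the sub-populations of one node.
struct NodeState {
    double S;  // susceptible
    double I;  // infected
    double R;  // removed
};

// One forward-Euler step of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
// dR/dt = gamma*I. Flows are clamped so no compartment becomes negative.
NodeState sirStep(NodeState n, double beta, double gamma, double dt) {
    assert(beta >= 0.0 && gamma >= 0.0 && dt > 0.0);
    const double newInfections = std::min(beta * n.S * n.I * dt, n.S);
    const double newRemovals = std::min(gamma * n.I * dt, n.I);
    n.S -= newInfections;
    n.I += newInfections - newRemovals;
    n.R += newRemovals;
    return n;  // S + I + R is unchanged by construction
}
```

Because every individual leaving one compartment enters another, the total Ni(t) of the node is conserved by this step. Only the movement update changes it.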
2nd update of local populations according to potential movements of individuals to other nodes: each node is given a proportion of moving individuals (within the susceptible and infected populations) according to:
- its local population
- the number of affected individuals inside it
Although this simple model is adapted to the spatial context of Venice and to the plague itself, it has a flaw: it is memoryless. Hence, the probability that an infected person is removed does not depend on how long that person has already been infected. We hope that this simplification is justified by the actual heterogeneity in a population (e.g. small children will be able to carry the bacterial burden for much less time than a healthy adult). As mentioned before, the ultimate goal of this project is to train and test the model against actual data from the 16th and 17th centuries. More precisely, the Provveditori alla Sanita register contains the following information regarding each death:
- The date and place of the registration
- The name of the deceased
- The name of the deceased’s father (in case of identical names, also the grandfather’s name)
- For women, the husband’s name and the civil status (married or widow)
- The age
- The description of the death cause
- The place of death and, where applicable, the transportation
- The name of the doctor examining the body and validating death
- The burial place
Therefore, we could tune both the initial conditions (population at each node) and the parameters of the SIR differential equations (personalized average infectious period, contact rate and optimal time step based on individual characteristics).
We started our project in Matlab to test out our model and then decided to switch to C++ for various reasons. The choice of a more object-oriented language was justified by the increase in computation speed, the better definition of structures and classes and a more flexible programming environment. Moreover, this makes changes after the final report easier if our research is used in the future. More importantly, it would make it easy to implement an “Individual” class in which specific mobility and epidemic characteristics could be assigned individually, along with a notion of memory (path visited, time of infection, etc.). This is also a way to bring more diversity, variability and realism to the simulation, as susceptibility criteria can be implemented for each individual to further increase the resemblance to a real epidemic outbreak. We chose a model with a high level of abstraction because it conveniently allows the regulation of random jumps, but as we stated before, a model with a lower abstraction level, in which individuals have specific behaviours and characteristics, could dramatically change the simulation outcome. Even if necrology registers are not available just yet, one could implement the Individual class and compare the results of a more deterministic model to ours. As an example, one could use the registers to obtain the population density of each parish and its evolution during the plague waves in order to fit these initial constants in our model. Since there was not enough time remaining to implement such an intricate notion in our model, we decided to continue with our initial idea of developing an SIR model at the node level.
Next, we discuss the visualization tools and the coding components of the program. The visualization interface was built with wxWidgets and contains: a slider to modify the contact rate and the infectious period, a button to choose the kernel (power law or exponential law), a start and pause button, a log file generator to retrace the evolution of the simulation, two maps with recording tools (a map of the nodes, with the colour indicating the SIR state of each node: green for a majority of susceptible, red for a majority of infected, blue for a majority of removed and black for an empty node; and a heatmap with a continuous representation of one of the sub-populations) and a graph representing the temporal evolution of the total size of each sub-population. The software provides tools to verify and troubleshoot the model: random generation of nodes in a circular layout and selection of the node to be infected. The model is mostly probabilistic, and for this reason we added a fast mode to the program so that it can run as fast as possible. It is then possible to extract and export the results and compare the outcomes of multiple runs. We also added three sliders controlling the mobility of each sub-population (Susceptible, Infected and Removed), which work by modulating the kernel coefficients and thus introducing a bias in the probabilities.
We introduced the notion of node attractiveness in our model so that, once more accurate data about the organisation of Venice during the plague outbreaks is available, one could change the attractiveness of particular nodes depending on their role in the city (hospital, cemetery, central places, churches). The user can set the attractiveness of a particular node with a simple click on it. This modifies the probabilities computed when building the alias and probability matrices. Even though Venetian citizens had poor medical knowledge about the plague, they are known to have been pioneers in some domains such as quarantine. One could use attractive and repulsive nodes to mimic a quarantine procedure. Concerning travelling, we chose to keep the removed population immobile, but this could easily be changed to get a more realistic simulation, as we know that corpses were moved around the city. Depending on the simulation parameters, we can already observe that some nodes become deserted in some cases. One could try to force such behaviours using node attractiveness, but one could also implement more elaborate travelling rules specific to each sub-population, or even escape travelling for susceptible individuals and for newly infected individuals who are not yet aware of their infection. We know that in the context of Venice this particular feature would not make much sense, as citizens mostly thought that the plague was due to evil spirits, but it would still be a nice addition to the program.
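One possible way to combine attractiveness with a distance kernel is sketched below. It is only an assumption about how the two could interact: the kernel weight of each destination is multiplied by that destination's attractiveness and the result is normalized. The `Node` layout, the `powerLawKernel` form and the parameters `d0` and `exponent` are hypothetical and do not describe the program's actual formula.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical node: a position plus a user-set attractiveness factor.
struct Node {
    double x;
    double y;
    double attractiveness;  // 1 = neutral, >1 attractive, 0 = never chosen
};

// Hypothetical power-law kernel with a heavy tail (allows long jumps).
double powerLawKernel(double distance, double d0, double exponent) {
    return std::pow(1.0 + distance / d0, -exponent);
}

// Probability of ending up at each node when leaving `origin`.
// Assumption: staying at the origin is allowed (distance 0, kernel 1).
std::vector<double> destinationProbabilities(const std::vector<Node>& nodes,
                                             std::size_t origin,
                                             double d0, double exponent) {
    assert(origin < nodes.size());
    std::vector<double> p(nodes.size());
    double total = 0.0;
    for (std::size_t j = 0; j < nodes.size(); ++j) {
        const double d = std::hypot(nodes[j].x - nodes[origin].x,
                                    nodes[j].y - nodes[origin].y);
        // Attractiveness biases the purely distance-based weight.
        p[j] = nodes[j].attractiveness * powerLawKernel(d, d0, exponent);
        total += p[j];
    }
    assert(total > 0.0);
    for (double& value : p) value /= total;  // normalize to sum to 1
    return p;
}
```

Under this assumption, setting the attractiveness of a node to 0 makes it unreachable, which is one crude way to mimic a quarantined location. A large value would instead draw individuals towards places such as a hospital or a church.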
Finally, we would like to show some results we obtained using different parameters for the model. In the examples given, the infectious spread seems to fit the topology of Venice and also captures the major hallmarks of the propagation of an infection, such as a pronounced susceptibility to infection in central locations.
Depending on the kernel, we can accurately simulate different epidemic behaviours. In particular, we achieved a simulation of a wave-front disease propagating from an initial center of infection (due to a highly restrictive contact kernel, e.g. an exponential law) as opposed to a nonlocal spread of the disease, with many satellite outbreaks and no clear wave-front pattern (due to a less restrictive contact kernel such as a power law). The position of the first outbreak has great importance as the spread can stay isolated if the node was in a remote location or in the periphery. Nevertheless, we still observed a tendency of propagation towards highly populated zones. Overall, the model is capable of delivering stable and unstable patterns of propagation often observed when computing non-linear equations.
We came up with many ideas during this semester, and although we would be thrilled to spend more time developing our software, the list of possible improvements to this project is long. The end of this report is dedicated to an overview of those ideas and potential future improvements. The software includes an XML parser that can load and save any predefined system with specific nodes and populations and can easily be extended to include edges (specified links between nodes). This parser could also be used to load characteristics for each Individual if this class gets implemented. This feature was included for convenience and to encourage further development.
Random jumps (or long-distance travel) were one of the requirements for a realistic model. This requirement was fulfilled without extra effort, as the travelling probabilities of our model depend on the inter-nodal distance. Random jumps happen at a frequency that can be adjusted through the kernel type and kernel coefficient. But if edges are implemented, the model becomes a small-world network and the problem gets more complex, as there are several things to think through:
- Edges are efficient in terms of resources, as there is no need to compute travelling probabilities between every pair of nodes. In our project, we had to rely on the alias method to compute and store all the probabilities of the system, as individuals can potentially go from any node to any other node.
- With edges, random jumps/large-scale travelling cannot simply be activated by adjusting a parameter. A choice has to be made concerning the travelling: in the context of Venice, it is possible to rely on the canals and create long edges representing maritime routes that can cross the entire map, or to manually add artificial random jumps to any other node.
To conclude, we would like to thank Giovanni Colavizza for his guidance and helpful comments. With his help, we were able to provide a cross-platform C++ application for modelling and visualizing the spread of epidemics. The software has a user-friendly interface to regulate and fine-tune the different parameters of the simulation. We tried to make it modular enough to be adapted to varying scenarios. We intend to give the College of Humanities of EPFL the opportunity to use our program, which is protected by copyright. We would be thankful if our names were included in any publication using our program. A dedicated help file is provided to guide the user through the setup of all the components necessary to run and/or modify the program. We hope that this project will be useful to others and that its development will not stop here.