The aim of the project is to create a cheap storage solution for DHers (digital humanities researchers), made up of individual monoliths. These would ensure local data reliability and data redundancy across the network of monoliths, and should be viewable as a single database constituting The Great Common of digitized human cultural content.
We aim to describe a possible system implementing all of the above.
The traditional way to deal with the issue of data storage is to use a NAS (Network-Attached Storage) built around a RAID (Redundant Array of Independent Disks) to ensure protection against disk failure.
However, we are instead considering a decentralized aggregation of the data held by all the monoliths, allowing anyone to access The Great Common of cultural data as it is digitized, through some kind of database-explorer software.
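To make the "single database" idea concrete, here is a minimal sketch, assuming each monolith publishes a catalogue keyed by content hash; the explorer software would merge these into one view that also records which monoliths hold each item. All names and fields here are illustrative assumptions, not a fixed design.

```python
def merge_catalogues(catalogues):
    """Merge per-monolith catalogues into one view of The Great Common.

    `catalogues` maps a monolith name to its catalogue, which maps a
    content hash to a metadata record. The merged view keeps one record
    per hash and tracks which monoliths hold a copy.
    """
    common = {}
    for monolith, catalogue in catalogues.items():
        for content_hash, record in catalogue.items():
            entry = common.setdefault(
                content_hash, {"record": record, "held_by": set()})
            entry["held_by"].add(monolith)
    return common

# Two hypothetical monoliths sharing one item and holding one unique item.
catalogues = {
    "monolith-a": {"hash1": {"title": "Manuscript scan"}},
    "monolith-b": {"hash1": {"title": "Manuscript scan"},
                   "hash2": {"title": "Oral history audio"}},
}
view = merge_catalogues(catalogues)
```

Keying on a content hash (rather than a filename) is what lets the merged view deduplicate identical items digitized on different monoliths.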
- In the case of hardware RAID, the idea is to use a Raspberry Pi (around $35), because it is powerful enough while remaining inexpensive.
- However, the Raspberry Pi is not powerful enough for ZFS. The alternative lies in small-footprint, low-power motherboards (mini-ITX, for example, around €60–80 with an integrated CPU). These are slightly less affordable, but would allow us to use ZFS and therefore have a more "solid" system. In addition, the headroom in computing power would also allow for heavier software that could be useful to DHers in the field.
- Who needs it?
- Why, and for what purpose?
- What form of storage and data access would be most convenient to you? (local network only? private access over the internet? broad access across all the monoliths?)
- What level of reliability is needed? (local RAID, network-distributed redundancy?)
- If the monoliths can network with one another, should they form a restricted workgroup, or be global (in the spirit of the Great Common)?
- How much storage is required or wanted?
- The stability of the software distribution and the specifically crafted scripts, programs, and settings.
- The computing power needed for the software to run seamlessly.
- The distributed networking of many monoliths together, emulated on just one or a few computers.
- Stability of the systems against power outages (sudden shutdown of the virtual machine), hard-disk failures (total, or limited to the data), etc.
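One of the network behaviors to test is how copies of an item get distributed across monoliths. As a minimal sketch, assuming each item is replicated on k distinct monoliths and placed by a deterministic hash-based rule (an illustrative assumption, not a fixed design):

```python
import hashlib

def place_copies(item_id, monoliths, k=2):
    """Return the k distinct monoliths that should hold a copy of item_id.

    The placement is deterministic: hashing the item identifier picks a
    stable starting index, and the k replicas go to consecutive
    monoliths (wrapping around). Hypothetical rule for illustration.
    """
    if k > len(monoliths):
        raise ValueError("not enough monoliths for the requested redundancy")
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(monoliths)
    return [monoliths[(start + i) % len(monoliths)] for i in range(k)]

monoliths = ["m1", "m2", "m3", "m4"]
replicas = place_copies("scan-0001.tiff", monoliths, k=2)
```

Because placement depends only on the item identifier and the list of monoliths, any node can recompute where a given item lives without a central coordinator, which is what makes the decentralized aggregation testable in virtual machines.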
Then, once all the bugs have been sorted out and the required computing resources pinned down, a small number of physical systems would be built and tested, with their networking validated in conjunction with virtual machines emulating a large number of monoliths.
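One resilience test from the list above can be automated directly: with every item replicated on two monoliths, the loss of any single monolith must leave every item reachable. The data layout below is an illustrative assumption.

```python
def surviving_items(placement, failed):
    """Items still reachable after the monoliths in `failed` go down.

    `placement` maps each item to the set of monoliths holding a copy.
    An item survives if at least one of its holders is still up.
    """
    return {item for item, holders in placement.items()
            if any(m not in failed for m in holders)}

# Hypothetical two-copy layout across three monoliths.
placement = {
    "scan-0001": {"m1", "m2"},
    "audio-0042": {"m2", "m3"},
}
# Simulate losing monolith m2: each item survives on its other copy.
after_failure = surviving_items(placement, {"m2"})
```

A test harness could run this check for every single-monolith failure (and pairs, for higher redundancy levels) against the layout reported by the virtual machines.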