SimFS

SimFS is a file system interface developed at the Scalable Parallel Computing Laboratory (SPCL), ETH Zürich, that balances storage and computing resources for large scientific simulations generating petabytes of data. I contributed to this project as part of my BSc thesis in Computer Science.

Di Girolamo S, Schmid P, Schulthess T, Hoefler T
SimFS: A Simulation Data Virtualizing File System Interface
IEEE, May 2019, 33rd IEEE International Parallel & Distributed Processing Symposium (IPDPS'19)
simfs | my new Erdős number is 3 | more details and text | github

Schmid P
Scientific Simulations Analysis and Benchmarking
BSc thesis in Computer Science at the Scalable Parallel Computing Laboratory (SPCL), Department of Computer Science, ETH Zürich, Switzerland, supervised by Prof. Dr. Torsten Hoefler and Salvatore Di Girolamo.
thesis | Scalable Parallel Computing Laboratory (SPCL)

Scientific simulations can produce petabytes of data, which are typically accessed later for post-simulation analysis. Keeping all of this data stored for years may be neither cost-efficient nor feasible, depending on the project and the available resources. With computing power becoming less expensive, it may be advantageous to keep only a subset of the data stored and re-simulate the rest on demand.

SimFS introduces a virtualization layer – which is fully transparent to the simulation software and the post-simulation analysis software – that can manage the data files and re-launch new simulations on demand.
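The idea of re-creating missing files on demand can be illustrated with a small toy sketch. This is not the SimFS implementation or its API: the class `VirtualizedStore`, the `resimulate` callback, the file naming scheme `out_<step>.dat`, and the `toy_simulation` function are all hypothetical names made up for illustration. The sketch only shows the general pattern of intercepting a file open, checking whether the file is stored, and regenerating it if it is not.

```python
import os
import tempfile
from typing import Callable


class VirtualizedStore:
    """Hypothetical stand-in for a data-virtualization layer (not SimFS's API)."""

    def __init__(self, root: str, resimulate: Callable[[str], None]):
        self.root = root                # directory holding the stored subset
        self.resimulate = resimulate    # assumed callback that regenerates a file
        self.resimulations = 0          # number of on-demand re-simulations so far

    def open(self, name: str, mode: str = "r"):
        path = os.path.join(self.root, name)
        if not os.path.exists(path):
            # Miss: the file is not stored, so regenerate it before opening.
            self.resimulate(path)
            self.resimulations += 1
        return open(path, mode)


def toy_simulation(path: str) -> None:
    """Toy 'simulation': writes a file whose content depends only on the step."""
    # Assumed naming scheme: out_<step>.dat
    step = int(os.path.basename(path).split("_")[1].split(".")[0])
    with open(path, "w") as f:
        f.write(f"state at step {step}\n")


def demo():
    with tempfile.TemporaryDirectory() as root:
        store = VirtualizedStore(root, toy_simulation)
        with store.open("out_42.dat") as f:   # miss: triggers a re-simulation
            first = f.read()
        with store.open("out_42.dat") as f:   # hit: file is now stored
            second = f.read()
        return first, second, store.resimulations
```

In this sketch the analysis code only calls `store.open(...)` and never needs to know whether the file was already on disk or had to be regenerated, which mirrors the transparency described above in a very simplified form.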

A detailed description of the system, including benchmarking on Piz Daint (CSCS), a cost analysis, a link to the source code, and additional features, is given in the publication.