In this talk we will present the new buddy-checkpointing feature of SIONlib, which is designed to support checkpointing of simulation data on node-local storage for applications that use a task-local I/O pattern. With this new feature, SIONlib is able to store data in a virtual shared file container on the local storage of the compute nodes, while local data is automatically and transparently mirrored to the local storage of a set of buddy nodes. The buddy-checkpointing feature of SIONlib has been developed within the EU project DEEP-ER to support task-local I/O on hierarchical I/O infrastructures. Additionally, SIONlib can work together with the multi-level checkpoint library SCR to support b...
Abstract—Massively parallel applications often require periodic data checkpointing for program resta...
A checkpoint is defined as a designated place in a program at which normal processing is interrupted s...
The next generation of capability-class, massively parallel processing (MPP) systems is expected to ...
Applications on current large-scale HPC systems use enormous numbers of processing elements for thei...
Abstract: Checkpointing is a procedure of storing process state to a file, which is later used to re...
In this talk we will present the recent developments of the benchmarking environment JUBE, the batch...
Applications on current large-scale HPC systems use enormous numbers of processing elements for thei...
As illustrated by the IO500 ranking, performance and scalability of storage systems have improved dra...
Other funding: CRUE-CSIC transformative agreement. Due to the increase and complexity of computer systems, r...
The efficient utilization of current supercomputing systems with deep storage hierarchies demands sc...
A Parallel Single Level Store (PSLS) system integrates a shared virtual memory and a parallel file ...
Global checkpointing to external storage (e.g., a parallel file system) is a c...
The advent of cluster computing has resulted in a thrust towards providing software mechanisms for r...
Efficient checkpointing of distributed data structures periodically at key mom...
We present a new approach to handling the demanding I/O workload incurred during checkpoint writes e...