The contributions of this paper are the following.
• We describe the implementation of the C3 system for semi-automatic application-level checkpointing of C programs. The system has (i) a pre-compiler that instruments C programs so that they can save their states at program execution points specified by the user, and (ii) a novel memory allocator that manages the heap as a collection of pools.
• We describe two static analyses for reducing the overhead of saving and restoring the application state. The first one optimizes stack variables, while the second one optimizes heap data structures.
• To benchmark our system, we compare the overheads introduced by our semi-automatic approach with the overhead of handwritten application-level che...
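To make "application-level checkpointing" concrete, here is a minimal handwritten sketch of the idea in C. It is not the C3 pre-compiler's output or API: the `app_state` struct, the `save_checkpoint`, `restore_checkpoint`, and `run` functions, and the flat binary file format are all hypothetical and chosen only for illustration. The sketch assumes that all live state fits in one plain struct with no pointers, which is exactly the bookkeeping a tool such as C3 aims to automate for stack and heap data.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical application state: everything needed to resume the loop. */
typedef struct {
    int next_iter;   /* iteration to resume from */
    double acc[4];   /* live data carried across iterations */
} app_state;

/* Write the state to `path`; returns 0 on success, -1 on failure. */
static int save_checkpoint(const char *path, const app_state *s)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t n = fwrite(s, sizeof *s, 1, f);
    if (fclose(f) != 0 || n != 1) return -1;
    return 0;
}

/* Read the state back from `path`; returns 0 on success, -1 on failure. */
static int restore_checkpoint(const char *path, app_state *s)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t n = fread(s, sizeof *s, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}

/* Run iterations [s->next_iter, limit), saving the state after each one.
 * The save call marks the programmer-chosen checkpoint location. */
static int run(app_state *s, int limit, const char *path)
{
    for (int i = s->next_iter; i < limit; i++) {
        s->acc[i % 4] += (double)i;
        s->next_iter = i + 1;
        if (save_checkpoint(path, s) != 0) return -1;
    }
    return 0;
}
```

On restart, a program would call `restore_checkpoint` first and fall back to a zero-initialized `app_state` if no checkpoint file exists, then call `run` to continue from `next_iter`. Writing the raw struct keeps the sketch short, but it makes the file non-portable across architectures and compilers.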
This paper focuses on the performance evaluation of Compiler for Portable Checkpointing (CPPC), a to...
Trends in high-performance computing are making it necessary for long-running applications to toler...
Checkpoint is defined as a designated place in a program at which normal processing is interrupted s...
The contributions of this paper are the following. We describe the implementation of the $C^3$ syst...
Abstract. As modern supercomputing systems reach the peta-flop performance range, they grow in both ...
In this paper we present compiler-assisted checkpointing, a new technique which uses static program ...
This thesis examines the feasibility of applying compile-time information to assist in rollback reco...
Checkpoint and Recovery (CPR) systems have many uses in high-performance computing. Because of this,...
Checkpointing support allows program execution to roll back to an earlier program point, discarding ...
Abstract. As modern supercomputing systems reach the peta-flop performance range, they grow in both...
As modern supercomputing systems reach the peta-flop performance range, they grow in both size and ...
To make progress in the face of failures, long-running parallel applications need to save their ...
With the evolution of high-performance computing towards heterogeneous, massively parallel systems,...
As modern supercomputing systems reach the peta-flop performance range, they grow in both size and c...