In this paper we present compiler-assisted checkpointing, a new technique that uses static program analysis to optimize the performance of checkpointing. We achieve this performance gain using libckpt, a checkpointing library that implements memory exclusion in the context of user-directed checkpointing. The correctness of user-directed checkpointing depends on the programmer analyzing the program and inserting memory exclusion calls. With compiler-assisted checkpointing, this analysis is automated by a compiler or preprocessor. The resulting memory exclusion calls optimize the performance of checkpointing and are guaranteed to be correct. We provide a full description of our program analysis techniques and present detailed ex...
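The idea of memory exclusion mentioned above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyCheckpointer`, `register`, `exclude`, `include`, and `demo` are hypothetical names invented for illustration and are not libckpt's API. The model treats memory as named regions and shows how excluding a region that is dead at the checkpoint (it will be overwritten before it is read again) shrinks the saved state.

```python
class ToyCheckpointer:
    """Toy model of memory exclusion (hypothetical; not libckpt's API)."""

    def __init__(self):
        self.regions = {}      # region name -> bytearray holding its contents
        self.excluded = set()  # names of regions skipped at checkpoint time

    def register(self, name, size):
        """Create a zero-filled region that is checkpointed by default."""
        self.regions[name] = bytearray(size)
        return self.regions[name]

    def exclude(self, name):
        """Mark a region as not needed in subsequent checkpoints."""
        self.excluded.add(name)

    def include(self, name):
        """Undo a previous exclusion so the region is saved again."""
        self.excluded.discard(name)

    def checkpoint(self):
        """Return a snapshot containing only the non-excluded regions."""
        return {
            name: bytes(buf)
            for name, buf in self.regions.items()
            if name not in self.excluded
        }


def checkpoint_size(snapshot):
    """Total number of bytes stored in a snapshot."""
    return sum(len(data) for data in snapshot.values())


def demo():
    ckpt = ToyCheckpointer()
    ckpt.register("state", 1024)    # live data that must survive a restart
    ckpt.register("scratch", 4096)  # temporary buffer, assumed dead here

    full = checkpoint_size(ckpt.checkpoint())

    # In the user-directed setting the programmer inserts this call by hand;
    # in the compiler-assisted setting an analysis would decide it is safe.
    ckpt.exclude("scratch")
    reduced = checkpoint_size(ckpt.checkpoint())
    return full, reduced


if __name__ == "__main__":
    print(demo())
```

In this sketch the correctness burden sits entirely on whoever calls `exclude`: if `scratch` were actually read after a restart, the recovered program would be wrong. That is the analysis the abstract proposes to automate.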
As modern supercomputing systems reach the peta-flop performance range, they grow in both size and c...
This paper describes a compiler-assisted approach for static checkpoint insertion. Instead of fixing...
Checkpointing is a pivotal technique in system research, with applications ranging from crash recove...
The contributions of this paper are the following. We describe the implementation of the $C^3$ syst...
Checkpointing support allows program execution to roll back to an earlier program point, discarding ...
The contributions of this paper are the following. • We describe the implementation of the C3 system...
This thesis examines the feasibility of applying compile-time information to assist in rollback reco...
To make progress in the face of failures, long-running parallel applications need to save their ...
For checkpointing to be practical, it has to introduce low overhead for the targeted application. As...