Checkpointing support allows program execution to roll back to an earlier program point, discarding any modifications made since that point. Existing software-based checkpointing methods are mainly libraries that snapshot all of working memory, and hence have prohibitive overhead for many potential applications. In this thesis we present a lightweight, fine-grained checkpointing framework implemented entirely in software through compiler transformations and optimizations. A programmer can specify arbitrary checkpoint regions via a simple API, and the compiler automatically transforms the code to implement the checkpoint at the granularity of individual stores, optimizing to remove redundancy. We explore three application areas for t...
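The store-granularity idea in the abstract above can be illustrated with a small runtime sketch. This is not the thesis's framework, which works through compiler transformations. It is a toy undo log written by hand, and every name in it (`CheckpointedMemory`, `begin_checkpoint`, `store`, `commit`, `rollback`) is hypothetical. It assumes a word-addressed memory and logs the old value only on the first store to each address inside a region, which is one simple form of the redundancy removal the abstract mentions.

```python
class CheckpointedMemory:
    """Toy word-addressed memory with a store-granularity undo log.

    Hypothetical illustration only; not the API described in the thesis.
    """

    def __init__(self, size):
        self.cells = [0] * size
        self.undo_log = None  # None means no checkpoint region is active

    def begin_checkpoint(self):
        # Start a region: the log maps address -> value held at region entry.
        self.undo_log = {}

    def store(self, addr, value):
        # Save the old value only on the first store to each address.
        # Later stores to the same address are redundant for rollback.
        if self.undo_log is not None and addr not in self.undo_log:
            self.undo_log[addr] = self.cells[addr]
        self.cells[addr] = value

    def commit(self):
        # Keep all modifications and end the region.
        self.undo_log = None

    def rollback(self):
        # Discard all modifications made since begin_checkpoint().
        if self.undo_log is None:
            raise RuntimeError("no active checkpoint region")
        for addr, old_value in self.undo_log.items():
            self.cells[addr] = old_value
        self.undo_log = None


mem = CheckpointedMemory(4)
mem.store(0, 7)            # ordinary store, outside any region
mem.begin_checkpoint()
mem.store(0, 1)            # first store to address 0: old value 7 is logged
mem.store(0, 2)            # second store to address 0: nothing new is logged
mem.store(3, 9)            # first store to address 3: old value 0 is logged
logged_entries = len(mem.undo_log)
mem.rollback()             # memory returns to its state at begin_checkpoint()
```

Only the addresses actually written are saved, so the cost scales with the number of distinct stores in the region rather than with the size of memory, in contrast to the whole-memory snapshots the abstract criticizes.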
Checkpointing is widely used in robust fault-tolerant applications. We present an efficient incremen...
Checkpointing is a pivotal technique in system research, with applications ranging from crash recove...
As modern supercomputing systems reach the peta-flop performance range, they grow in both size and ...
This thesis examines the feasibility of applying compile-time information to assist in rollback reco...
To make progress in the face of failures, long-running parallel applications need to save their ...
In this paper we present compiler-assisted checkpointing, a new technique which uses static program ...
The contributions of this paper are the following. We describe the implementation of the $C^3$ syst...
High-frequency memory checkpointing is an important technique in several application domains, such a...
This paper describes our experience with the implementation and applications of the Unix checkpointi...
With processor vendors pursuing multicore products, often at the expense of the complexity and aggre...