Heretofore, automatic checkpointing at procedure-call boundaries, to reduce the space complexity of reverse mode, has been provided by systems like Tapenade. However, binomial checkpointing, or treeverse, has only been provided in Automatic Differentiation (AD) systems in special cases, e.g., through user-provided pragmas on DO loops in Tapenade, or as the nested taping mechanism in ADOL-C for time integration processes, which requires that user code be refactored. We present a framework for applying binomial checkpointing to arbitrary code with no special annotation or refactoring required. This is accomplished by applying binomial checkpointing directly to a program trace. This trace is produced by a general-purpose checkpointing mechanis...
To make progress in the face of failures, long-running parallel applications need to save their ...
The contributions of this paper are the following. • We describe the implementation of the C3 system...
Checkpointing is widely used in robust fault-tolerant applications. We present an efficient incremen...
Classical reverse-mode automatic differentiation (AD) imposes only a small constant-factor overhead ...
This paper presents a new functionality of the Automatic Differentiation (AD) Tool Tapenade. Tapenad...
Checkpointing support allows program execution to roll-back to an earlier program point, discarding ...
The contributions of this paper are the following. We describe the implementation of the $C^3$ syst...
In this paper we present compiler-assisted checkpointing, a new technique which uses static program ...
Checkpointing is a common technique for reducing the time to recover from faults in computer systems...
Abstract. As modern supercomputing systems reach the peta-flop performance range, they grow in both ...
Checkpointing tools may be typically implemented at two different abstraction levels: at the system ...
SIGLE. Available from TIB Hannover: F02B139 / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technis...