This paper deals with the impact of fault prediction techniques on checkpointing strategies. We extend the classical first-order analysis of Young and Daly in the presence of a fault prediction system, characterized by its recall and its precision. In this framework, we provide optimal algorithms to decide whether and when to take predictions into account, and we derive the optimal value of the checkpointing period. These results allow us to analytically assess the key parameters that impact the performance of fault predictors at very large scale.
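As a point of reference for the quantities named in the abstract, the following is a minimal sketch of the classical first-order Young/Daly analysis without any predictor, together with the standard definitions of recall and precision. It is not the paper's prediction-aware result. The function and parameter names (`young_daly_period`, `first_order_waste`, `mtbf`, `checkpoint_cost`) are illustrative assumptions, and the waste expression ignores downtime and recovery costs.

```python
import math


def young_daly_period(mtbf: float, checkpoint_cost: float) -> float:
    """Classical first-order optimal checkpointing period, sqrt(2 * mtbf * C).

    Assumes checkpoint_cost is small compared to mtbf (same time unit for both).
    """
    return math.sqrt(2.0 * mtbf * checkpoint_cost)


def first_order_waste(period: float, mtbf: float, checkpoint_cost: float) -> float:
    """First-order fraction of time lost: C / T for checkpoints plus T / (2 * mtbf)
    for work re-executed after a fault. Downtime and recovery are ignored here
    (a simplifying assumption of this sketch).
    """
    return checkpoint_cost / period + period / (2.0 * mtbf)


def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual faults that the predictor announced."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of announced predictions that corresponded to actual faults."""
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    mtbf, cost = 20000.0, 100.0  # hypothetical values, in seconds
    period = young_daly_period(mtbf, cost)
    print(period, first_order_waste(period, mtbf, cost))
```

With the hypothetical values above, the two waste terms balance at the computed period, which is why the square-root form minimizes the first-order expression.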
This report provides an introduction to resilience methods. The emphasis is on checkpointing, the de...
This paper investigates the optimal number of processors to execute a parallel...
With increasing scale and complexity of supercomputing and cloud computing arc...
This paper deals with the impact of fault prediction techniques on checkpointi...
Readers should instead consult RR-8237 and RR-8239, which cover the entirety of this report in a more precis...
This paper deals with the impact of fault prediction techniques on checkpointi...
The Young/Daly formula provides an approximation of the optimal checkpoint per...
This work provides an optimal checkpointing strategy to protect iterative appl...
This paper deals with the impact of fault prediction techniques on checkpointing strategie...
This article revisits checkpointing strategies when workflows composed of mult...
In this paper, we present a unified model for several well-known checkpoint/re...
In this paper, we revisit traditional checkpointing and rollback recovery stra...
This paper revisits checkpointing strategies when workflows composed of multiple tasks execute on a ...
This work provides an analysis of checkpointing strategies for minimizing expe...