The application of checkpointing as a fault-tolerance measure for real-time services (i.e., services that are restricted by deadlines) raises different problems than its application to non-real-time services. Therefore, different criteria are necessary for the assessment of checkpointing in such an environment. The probability of correct execution before the deadline, even in the presence of faults (responsiveness), is such a criterion for real-time services. Many checkpointing methods use equidistant checkpoints. For such methods, the number of checkpoints taken during execution time is the controlled parameter for maximizing responsiveness. In this paper, we give an exact analysis of the responsiveness of real-time services with checkpointing, us...
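To make the notion of responsiveness concrete, the following is a minimal Monte Carlo sketch in Python, not the exact analysis the abstract refers to. It rests on assumptions that are not taken from the paper: faults arrive as a Poisson process with rate `fault_rate`, a fault rolls execution back to the most recent checkpoint, each rollback costs a fixed `recovery_cost`, no faults occur during recovery, and the final segment needs no checkpoint. All function and parameter names are hypothetical.

```python
import random


def simulate_once(work, n_checkpoints, ckpt_cost, recovery_cost,
                  fault_rate, deadline, rng):
    """Simulate one run; return True if the task finishes by the deadline.

    Assumed model (illustrative only): n_checkpoints equidistant checkpoints
    split the work into n_checkpoints + 1 equal segments; a fault discards
    the progress of the current segment.
    """
    segments = n_checkpoints + 1
    seg_len = work / segments
    elapsed = 0.0
    for i in range(segments):
        # Every segment except the last is followed by a checkpoint save.
        cost = seg_len + (ckpt_cost if i < segments - 1 else 0.0)
        while True:
            # Time until the next fault (memoryless, so resampling is valid).
            if fault_rate > 0:
                t_fault = rng.expovariate(fault_rate)
            else:
                t_fault = float("inf")
            if t_fault >= cost:
                elapsed += cost  # segment (and its checkpoint) completed
                break
            # Fault hit: lose the partial segment, pay recovery, retry.
            elapsed += t_fault + recovery_cost
            if elapsed > deadline:
                return False
        if elapsed > deadline:
            return False
    return True


def responsiveness(work, n_checkpoints, ckpt_cost, recovery_cost,
                   fault_rate, deadline, runs=20000, seed=0):
    """Estimate the probability of finishing before the deadline."""
    rng = random.Random(seed)
    hits = sum(
        simulate_once(work, n_checkpoints, ckpt_cost, recovery_cost,
                      fault_rate, deadline, rng)
        for _ in range(runs)
    )
    return hits / runs


if __name__ == "__main__":
    # Sweep the number of checkpoints, the controlled parameter.
    for n in range(0, 9):
        est = responsiveness(work=10.0, n_checkpoints=n, ckpt_cost=0.3,
                             recovery_cost=0.2, fault_rate=0.1, deadline=14.0)
        print(n, round(est, 3))
```

Sweeping `n_checkpoints` in this way illustrates the trade-off: too few checkpoints make each rollback expensive, while too many add checkpointing overhead that eats into the slack before the deadline.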
The massive scale of current and next-generation massively parallel processing (MPP) systems present...
Large scale applications running on new computing platforms with thousands o...
When using primary-backup replication, one checkpoints the primary’s state to reduce the failover ti...
Checkpointing mechanism is used to tolerate the impact of transient faults by rollback opera...
To combat the increasing soft error rates in recent semiconductor technologies, it is important to e...
Employing fault tolerance often introduces a time overhead, which may cause a deadline violation in ...
Parallel execution time is expected to decrease as the number of processors in...
This report provides an introduction to the design of scheduling algorithms to cope with faults on l...
The large scale of current and next-generation massively parallel processing (MPP) systems presents ...
For the vast majority of computer systems correct operation is defined as producing the correct resu...
Correct operation of real-time systems (RTS) is defined as producing correct results within given ti...
Increasing soft error rates in recent semiconductor technologies enforce the usage of fault toleranc...