Abstract: The checkpointing mechanism is used to tolerate transient faults through rollback operations. Recently, it has also been used to extend system lifetime by identifying and tolerating permanent faults [5, 19, 10, 12]. However, equidistant checkpoint intervals may cause task deadline violations. Here, we propose an adaptive checkpoint interval placement algorithm (ADeLiRACI) that meets all task deadlines. The checkpoint intervals are adjusted to minimize the impact of stresses and permanent faults on the running hosts. This novel mechanism allows greater applicability in real-time systems with hard deadlines, such as weather prediction and financial transactions. We compare the estimated completion time for i...
Checkpointing is commonly adopted for enhancing the performance of software applications that operat...
Scientific workflows are data- and compute-intensive; thus, they may run for days or even weeks...
The Checkpointing Rollback Recovery protocol is often used to provide fault tolerance for real-time appl...
The application of checkpointing as a fault-tolerance measure for real-time services (i.e., services...
The large scale of current and next-generation massively parallel processing (MPP) systems presents ...
This report provides an introduction to the design of scheduling algorithms to cope with faults on l...
The massive scale of current and next-generation massively parallel processing (MPP) systems present...
Over the last decade, computing systems have turned to large-scale parallel platforms composed of thousand...
Adaptive checkpointing is a relatively new approach that is particularly suitable for providing faul...
Correct operation of real-time systems (RTS) is defined as producing correct results within given ti...
Checkpointing is a common technique for reducing the time to recover from faults in computer systems...
Abstract: This paper deals with the impact of fault prediction techniques on checkpointing strategie...