One of the major challenges in the wide use of Grid workflow systems is fault tolerance and avoidance. Checkpointing schemes provide a means of fault detection and recovery. In our research, we focus on the performance optimization of checkpointing schemes and dynamic voltage scaling (DVS) for Grid workflow systems. We propose offline checkpointing schemes with DVS and online adaptive checkpointing schemes that dynamically adjust the checkpointing intervals by using store checkpoints and compare checkpoints. When combined with DVS, offline adaptive checkpointing schemes are not only fault tolerant but also reduce the average execution time of tasks. These schemes can efficiently utilize comparison and storage operations and significantly impr...
Using additional store-checkpoints (SCPs) and compare-checkpoints (CCPs), we present an adaptive ch...
In grid workflow systems, to verify fixed-time constraints efficiently at the run-time execution sta...
In this paper, we present a checkpoint-based scheme to improve the turnaround time of bag-of-tasks a...
One of the major challenges in the wide use of Grid workflow systems is fault tolerance and avoidance. C...
Scientific workflows are data- and compute-intensive; thus, they may run for days or even weeks...
Jobs in Grid workflows are exposed to different types of failure. It is important to develop fault t...
As grids typically consist of autonomously managed subsystems with strongly varying resources, fault...
In grid workflow systems, a checkpoint selection strategy is responsible for selecting checkpoints f...
A grid is a distributed computational and storage environment often composed of heterogeneous autono...
Adaptive checkpointing is a relatively new approach that is particularly suitable for providing faul...
In grid workflow systems, a checkpoint selection strategy is responsible for selecting checkpoints f...
Scientific workflows are data- and compute-intensive; thus, they may run for days or even weeks on p...
Using additional store-checkpoints (SCPs) and compare-checkpoints (CCPs), we present an adaptive ch...
In grid workflow systems, existing representative checkpoint selection strategies, which are used to...
Checkpointing is a typical approach to tolerate failures in today’s supercomputing cluste...