System-level virtualization provides several advantages: (i) customization is eased since virtual machines may be based on different systems; (ii) virtual machines are isolated from hardware, and consequently applications are isolated via the virtual machines; (iii) basic fault tolerance mechanisms – proactive fault tolerance through virtual machine migration and virtual machine snapshot/restore; and (iv) basic load balancing mechanisms – the capability to move and stop virtual machines running in the system. However, the current Xen implementation does not natively provide mechanisms for virtual machine checkpoint/restart. This document presents the design of a reactive fault tolerant system, based on a checkpoint/restart mechanism for...
Checkpointing can store and recover applications when faults happen and is becoming critical to large ...
As the size of supercomputers increases, the probability of system failure grows substantially, posi...
In this paper, we propose a TCP/IP Replication scheme for a fault tolerance system to provide high a...
With the ever-increasing dependence on computers and networks, many systems are required to be conti...
Virtualization is a key piece of modern data center design. Virtualization provides the possibility ...
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At ...
Virtual machine checkpoints provide a clean encapsulation of the full state of an executing system....
System virtualization allows for the consolidation of many physical servers on a single physical host ...
Crash and omission failures are common in service providers: a disk can break down or a link can fai...
In this work, we present the design of the Checkpointing-Enabled Virtual Machine (CEVM) ar...
Transparent hypervisor-level checkpoint-restart mechanisms for virtual clusters (VCs) or clusters of...
This study explores a recovery strategy using checkpointing in a distributed shared virtual memory (...
Virtual machine monitors (VMMs) play a crucial role in the software stack of c...
We have implemented a commercial enterprise-grade system for providing fault-tolerant virtual machin...
The ability to migrate a virtual machine (VM) from one physical host to another is important in a nu...