In this paper, we propose an efficient scheduling algorithm for problems in which tasks with precedence constraints and communication delays have to be scheduled on a heterogeneous distributed system under a one-fault hypothesis. Based on an extension of the Critical-Path Method (CPM/PERT), our algorithm combines an optimal schedule with some additional task duplication to provide fault tolerance. Backup copies are not established for tasks that already have more than one original copy. The result is a schedule, computed in polynomial time, that is optimal when there is no failure and remains a good resilient schedule in the case of one server failure. Finally, we compare the optimal solutions with the resilient solutions found by this algorithm on severa...
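As a rough illustration of the two ingredients named in the abstract, the following sketch shows a standard critical-path forward pass over a task graph with communication delays, followed by the rule that only tasks with a single original copy receive a backup. It is not the algorithm of the paper: the function names, the input layout, and the simplifying assumptions (unlimited processors, every communication delay always paid, no heterogeneity) are hypothetical and chosen only to keep the example short.

```python
from collections import deque


def earliest_start_times(durations, edges):
    """Forward pass of the critical-path method on a task DAG.

    durations: {task: processing time}
    edges: {(pred, succ): communication delay}  (hypothetical input layout)
    Assumption: unlimited processors, and every delay is always paid.
    """
    succs = {t: [] for t in durations}
    indegree = {t: 0 for t in durations}
    for (u, v) in edges:
        succs[u].append(v)
        indegree[v] += 1

    start = {t: 0 for t in durations}
    ready = deque(t for t in durations if indegree[t] == 0)
    visited = 0
    while ready:  # Kahn-style topological traversal
        u = ready.popleft()
        visited += 1
        for v in succs[u]:
            # v cannot start before u finishes and its data has arrived
            start[v] = max(start[v], start[u] + durations[u] + edges[(u, v)])
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if visited != len(durations):
        raise ValueError("precedence graph contains a cycle")
    return start


def tasks_needing_backup(copies):
    """Tasks with exactly one original copy; duplicated tasks are skipped."""
    return sorted(t for t, servers in copies.items() if len(servers) == 1)


# Small made-up instance (all values are illustrative only).
durations = {"a": 2, "b": 3, "c": 1}
edges = {("a", "b"): 1, ("a", "c"): 4, ("b", "c"): 2}
start = earliest_start_times(durations, edges)
makespan = max(start[t] + durations[t] for t in durations)

copies = {"a": ["s1", "s2"], "b": ["s1"], "c": ["s2"]}
backups = tasks_needing_backup(copies)
```

On this instance the critical path is a, b, c, and only tasks b and c would be given a backup copy, since a already runs on two servers.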
Our goal is to automatically obtain a distributed and fault-tolerant embedded system: distributed be...
Proc. of the 37th IEEE International Conference on Parallel Processing (ICPP 2008) IEEE Computer Soci...
One of the main problems in distributed high-performance computing is how to allocate, schedule, ef...
7 pages. In this paper, we propose an efficient scheduling algorithm for problem...
Because fault failures tend to affect whole areas, in some cases, and not on...
Heterogeneous distributed systems are widely deployed for executing computatio...
Most list scheduling heuristics rely on a simple platform model where communication contentio...
Latency, fault tolerance and reliability are important requirements for several applications that ar...
In distributed systems, a real-time task has several subtasks which need to be executed at different...
This report provides an introduction to the design of scheduling algorithms to cope with faults on l...
Real-time systems are being extensively used in applications that are mission-critical and life-crit...
Often hard real-time systems require results that are produced on time despite the occurrence of pro...
Fault tolerance and latency are important requirements in several applications which are time critic...