The single-program multiple-data (SPMD) paradigm is becoming the most widespread way to program commercial multicomputers. In this paper we demonstrate that, for a wide class of SPMD algorithms, efficient fault tolerance can be achieved without hardware redundancy. We propose a software approach that aims to reconfigure data, thus keeping the slowdown in computation low owing to the fine granularity of the workload to be redistributed. In particular, we present and compare three data reconfiguration strategies applied to a problem model that includes a wide class of SPMD iterative algorithms characterized by nonlocal communications among the nodes. The result is that in most cases it is better to introduce some communication ov...
The working condition of a multicomputer system based on message passing communication is changeable...
Distributing applications over PC clusters to speed-up or size-up the executio...
This chapter describes a multi-SPMD (mSPMD) programming model and a set of sof...
As the sizes of distributed memory multiprocessors increase, the likelihood of a fault removing one ...
The use of dynamic reconfiguration has been proposed to tolerate faults in large-scale partitionable...
The occurrence of faults in multicomputers with hundreds or thousands of nodes is a likely event tha...
Several parallel processing systems exist that can be partitioned and/or can operate in mul...
Real parallel applications find little benefit from code portability that does not guarantee accept...
This paper describes a single-version algorithmic approach to design in fault tolerant computing in ...
We present compiler optimization techniques for explicitly parallel programs that communicate thro...
A wide range of applications make use of regular dynamic data structures. Dynamic data...
Efficient parallel computing on distributed platforms still presents many obstacles. This paper addr...
In this paper we consider the problem of reconfiguring processor arrays subject to computational loa...