As the sizes of distributed memory multiprocessors increase, the likelihood of a fault removing one of the processors from the system grows as well. Such a fault removes some or all of the following: (a) communication paths, (b) processing power, (c) topological consistency, and (d) progress of the running application. In this thesis we propose solutions to handle all of these problems. We handle the lost communication paths through table-based routing strategies. We present distributed algorithms for filling the tables, after a fault or repair, with the shortest communication paths surviving in the system. Also, we give algorithms for similarly filling broadcast tables and for guaranteeing that the routing tables are deadlock-free. To take c...
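The idea of refilling routing tables with the shortest surviving paths can be illustrated with a small sketch. This is not the thesis's distributed algorithm: it is a centralized breadth-first search, and the hypercube topology, the function names (`hypercube`, `build_routing_tables`, `route`), and the `faulty` parameter are all assumptions made for illustration.

```python
from collections import deque


def hypercube(dim):
    """Adjacency map of a dim-dimensional hypercube (assumed example topology)."""
    return {n: [n ^ (1 << b) for b in range(dim)] for n in range(1 << dim)}


def build_routing_tables(adj, faulty=frozenset()):
    """Map each surviving node to {destination: next hop} along a shortest
    path that avoids every node in `faulty` (hypothetical helper)."""
    alive = [n for n in adj if n not in faulty]
    tables = {n: {} for n in alive}
    for dest in alive:
        # Breadth-first search outward from dest. Links are bidirectional,
        # so the node from which v is first reached is v's next hop to dest.
        seen = {dest}
        queue = deque([dest])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in faulty or v in seen:
                    continue
                seen.add(v)
                tables[v][dest] = u
                queue.append(v)
    return tables


def route(tables, src, dst):
    """Follow next-hop entries from src to dst and return the node sequence."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][dst])
    return path


if __name__ == "__main__":
    cube = hypercube(3)
    tables = build_routing_tables(cube, faulty={1})
    print(route(tables, 0, 3))  # path from node 0 to node 3 that avoids node 1
```

After a fault or a repair, the tables would simply be rebuilt with the updated `faulty` set. A distributed version would have each node exchange distance information with its neighbours instead of running the search in one place, and this sketch makes no attempt at deadlock freedom.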
The single-program multiple-data (SPMD) paradigm is becoming the most diffuse way to program commerc...
In this paper, a new architecture of distributed embedded memory cores for SoC is proposed and an ef...
ISBN: 0769503713. One way to improve reliability in parallel computers consists of adding supplementar...
The use of dynamic reconfiguration has been proposed to tolerate faults in large-scale partitionable...
Various aspects of reliable computing are formalized and quantified with emphasis on efficient fault...
A network multicomputer is a multiprocessor in which the processors are connected by general-purpose...
The always increasing performance demands of applications such as cryptography, scientific simulatio...
In recent years the application space of reconfigurable devices has grown to include many platforms ...
In this paper, we present load redistribution algorithms for hypercubes in the presence of faults. O...
This thesis addresses the design of hypercube or hypercube-like message-passing computers that combi...
Several parallel processing systems exist that can be partitioned and/or can operate in mul...
Designed algorithms that are useful for developing protocols and supporting tools for fault toleranc...
Abstract: In this study, we've analyzed and implemented three different algorithms developed for fau...