We design algorithms that are useful for developing protocols and supporting tools for fault tolerance, dynamic load balancing, and distributed monitoring in loosely coupled multi-processor systems. Four efficient algorithms are developed to learn network topology and reconfigure distributed application programs during execution using the available tools for replication and process migration. The first algorithm provides techniques for transparent software reconfiguration based on process migration in the context of quadtree embeddings in hypercubes. Our novel approach provides efficient reconfiguration for some classes of faults that may be identified easily. We provide a theoretical characterization to use graph matching, quadratic assignment...
Fault-tolerant distributed algorithms play an important role in many critical/high-availability appl...
We consider issues of fault tolerance for distributed computing systems at two levels of system desi...
In this paper, we present load redistribution algorithms for hypercubes in the presence of faults. O...
As the sizes of distributed memory multiprocessors increase, the likelihood of a fault removing one ...
This dissertation presents techniques for detecting and tolerating faults in distributed systems...
We design two programs that maintain the nodes of any distributed system in a rooted spanning tree a...
There exist at least two models of parallel computing, namely, shared-memory and message-passing. Th...
PhD Thesis. This thesis describes the design and development of algorithms for fault tolerant distr...
Distributed systems are rapidly increasing in importance due to the need for scalable computatio...
Fault tolerance in distributed shared memory through replication has yet to be explored. This resear...
Various aspects of reliable computing are formalized and quantified with emphasis on efficient fault...
This thesis presents a general theory for designing multiprocessor computer systems that can tolerat...
Networking involves every aspect in the design of the network infrastructure from the selection/synt...
PhD Thesis. Designing and implementing distributed systems which continue to provide specified service...
A general framework for the design and analysis of distributed fault-tolerant systems is proposed in...