This paper presents various aspects of reliability, availability and serviceability (RAS) systems as they relate to group communication service, including reliable and totally ordered multicast/broadcast, virtual synchrony, and failure detection. While availability, particularly high availability using replication-based architectures, has recently attracted a surge of research interest, much remains to be done in understanding the basic underlying concepts for achieving RAS systems, especially in the high-end and high performance computing (HPC) communities. Various attributes of group communication service and the prototype of symmetric active replication following ideas utilized in the Newtop protocol will be dis
The author mainly concentrates on transactional distributed systems. Most previous research on repli...
In this paper we explore the use of group communication technology, developed in the Horus project t...
This paper presents our current work in characterizing the behavior of a real-time dependable distri...
This paper presents various aspects of reliability, availability and serviceability (RAS) systems as...
A widely used computational model for constructing fault-tolerant distributed applications employs a...
In recent years, the study of distributed systems has become an increasingly important focus of comp...
This work describes the design and implementation details of a reliable group communication mechanis...
Our project is a multi-institutional research effort that adopts interplay of RELIABILITY, AVAILABIL...
Many fault-tolerant group communication middleware systems have been implemented assuming crash fail...
multi-year Cornell research program in process group communication used for fault-tolerance, securit...
We describe a collection of communication primitives integrated with a mechanism for handling proce...
PhD Thesis. Many fault-tolerant group communication middleware systems have been implemented assuming ...
Group communication is the basic infrastructure for implementing fault-tolerant replicated servers. ...
Group communication is deployed in many evolving Internet-scale cooperative applications such as mul...