Abstract. In a distributed environment, several components collaborate with each other to deliver complex functionality. Adaptation is an emerging trend in distributed systems: the system reconfigures itself through component addition, removal, or update in order to cope with faults. Components are generally interdependent, so a fault propagates from one component to another. Existing root cause analysis techniques generally create a static fault dependency graph to identify the root fault. However, these dependencies keep changing with adaptations, which makes design-time fault dependencies invalid at run time. This paper describes the problem of deriving causal relationships of faults in adaptive distributed systems. It then presents...
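The problem stated in the abstract can be illustrated with a minimal Python sketch. It is not the method of the paper (the abstract is truncated before the approach is described). It only assumes that the dependency graph is updated whenever an adaptation rewires components, and that a faulty component counts as a root fault when none of its direct dependencies is faulty. The class `DependencyGraph`, its methods, and the component names `web`, `report`, `db`, and `replica` are hypothetical.

```python
from collections import defaultdict


class DependencyGraph:
    """Run-time component dependency graph (hypothetical illustration)."""

    def __init__(self):
        # depends_on[a] is the set of components that a directly depends on.
        self.depends_on = defaultdict(set)

    def add_dependency(self, component, dependency):
        self.depends_on[component].add(dependency)
        self.depends_on[dependency]  # register the dependency as a node

    def remove_dependency(self, component, dependency):
        # An adaptation rewired the component away from this dependency.
        self.depends_on[component].discard(dependency)

    def root_faults(self, faulty):
        """Return the faulty components that have no faulty direct dependency.

        Assumption: a fault is explained by a faulty direct dependency;
        anything unexplained is reported as a root fault.
        """
        faulty = set(faulty)
        return {
            c for c in faulty
            if not (self.depends_on.get(c, set()) & faulty)
        }


# Design-time view: both services depend on the same database.
graph = DependencyGraph()
graph.add_dependency("web", "db")
graph.add_dependency("report", "db")
before = graph.root_faults({"web", "db"})

# Adaptation at run time: "web" is rewired to a replica.
graph.remove_dependency("web", "db")
graph.add_dependency("web", "replica")
after = graph.root_faults({"web", "db"})

print(sorted(before))
print(sorted(after))
```

With the design-time graph, the fault observed in `web` is attributed to `db`. After the adaptation, the same pair of observed faults no longer shares a dependency edge, so a static graph would misattribute the `web` fault. Only direct dependencies are checked here to keep the sketch short; a transitive check would be needed where faults propagate through healthy-looking intermediaries.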
A general framework for the design and analysis of distributed fault-tolerant systems is proposed in...
The increasing complexity of distributed enterprise systems has made the task of managing these syst...
Distributed software systems have become the backbone of Internet services. Failures in production ...
Failures in computing systems are unavoidable. Therefore, it is important to detect and diagnose fai...
This document describes the research performed on fault isolation in dynamic distributed systems at ...
This paper describes a method for automated analysis of fault-tolerance properties of distributed sy...
Abstract. Self-diagnosis is a fundamental capability of self-adaptive systems. In order to recover fr...
A monitoring approach to the problem of constructing fault-tolerant and adaptive real-time systems, ...
We describe a methodology for identifying and characterizing dynamic dependencies between system com...
A rigorous, automated approach to analyzing fault-tolerance of distributed systems is presented. T...
An adaptive computing system is one that modifies its behavior based on changes in the environment...
The distributed systems research community has developed many provably correct algorithms and abstra...
In this paper we propose a methodology for the identification of the root caus...
Abstract. A method for automated analysis of fault-tolerance of distributed systems is presented. It...