In a network consisting of several thousand computers, the occurrence of faults is unavoidable. Being able to test the behavior of a distributed program in an environment where we can control the faults (such as the crash of a process) is an important feature for the deployment of reliable programs. In this paper, we extend FAIL-FCI (for Fault Injection Language and FAIL Cluster Implementation, respectively), a software tool that makes it possible to elaborate complex fault scenarios in a simple way, while relieving the user from writing low-level code. In particular, we show that not only are we able to fault-load existing distributed applications (as used in most current papers that address fault-tolerance issues), we are also a...
With the rise of software complexity, software-related accidents represent a significant threat for ...
Fault Tolerance Mechanisms (FTMs) are extensively used in software systems to counteract software fa...
PhD Thesis. One way of gaining confidence in the adequacy of fault tolerance mechanisms of a system...
In a network consisting of several thousand computers, the occurrence of faults is unavoidable. B...
In a network consisting of several thousand computers, the occurrence of faul...
One of the topics of paramount importance in the development of Grid middleware is the impact of fau...
We present a case study on fault injection testing at the interface level between components of a di...
One of the topics of paramount importance in the development of Cluster and Grid middleware is the i...
The paper deals with the problem of testing a computer system's susceptibility to hardware faults by m...
In a network made up of several thousand computers, the occurrence of faults is inevitable....
Software is being used for building applications requiring extreme dependability. In many cases, sys...
Fault injection is a pivotal technique in dependability benchmarking. Unfortunately, existing genera...
The analysis of fault injection experiments can be a cumbersome task. These experiments can generate...
As software for distributed systems becomes more complex, ensuring that a system meets its prescribe...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...