Abstract. Today’s distributed systems need runtime error detection to catch errors arising from software bugs, hardware errors, or unexpected operating conditions. A prominent class of error detection techniques operates in a stateful manner, i.e., it keeps track of the state of the application being monitored and then matches state-based rules. Large-scale distributed applications generate a high volume of messages that can overwhelm the capacity of a stateful detection system. An existing approach to handling this is to randomly sample the messages and process only the sampled subset. However, this approach leads to non-determinism with respect to the detection system’s view of what state the application is in. This in turn leads to degradation in the ...
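The effect of sampling on a stateful detector can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the three-transition protocol, the message names, and the helper functions `track`, `sample`, `reachable`, and `track_sampled` are all hypothetical. The sketch shows that a detector that sees every message knows one state, while a detector that sees only a sample must carry a set of possible states.

```python
import random

# Hypothetical protocol (an assumption for illustration): each message type
# moves the monitored application from one state to another.
TRANSITIONS = {
    ("IDLE", "open"): "CONNECTED",
    ("CONNECTED", "data"): "CONNECTED",
    ("CONNECTED", "close"): "IDLE",
}


def track(messages, start="IDLE"):
    """Process every message; return the final state, or None on a rule violation."""
    state = start
    for msg in messages:
        state = TRANSITIONS.get((state, msg))
        if state is None:
            return None  # no rule allows this message in this state
    return state


def sample(messages, rate, rng):
    """Keep each message independently with probability `rate`."""
    return [m for m in messages if rng.random() < rate]


def reachable(states):
    """All states reachable from `states` through any number of unseen messages."""
    seen = set(states)
    frontier = list(states)
    while frontier:
        current = frontier.pop()
        for (src, _msg), dst in TRANSITIONS.items():
            if src == current and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def track_sampled(observed, start="IDLE"):
    """Track the set of states the application may be in given only sampled messages."""
    possible = {start}
    for msg in observed:
        # Unseen messages may have arrived before this one.
        possible = reachable(possible)
        # Keep only the states in which the observed message is legal.
        possible = {
            TRANSITIONS[(s, msg)] for s in possible if (s, msg) in TRANSITIONS
        }
    # Unseen messages may also have followed the last observed one.
    return reachable(possible)


if __name__ == "__main__":
    stream = ["open", "data", "data", "close"]
    print(track(stream))
    print(track_sampled(sample(stream, 0.5, random.Random(0))))
```

In this toy protocol the sampled detector can rarely narrow its view to a single state, which is the kind of ambiguity the abstract calls non-determinism.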
As semiconductor technology scales into the deep submicron regime, the occurrence of transient or sof...
This article proposes an approach for the online analysis of accidental faults...
Failures in computing systems are unavoidable. Therefore, it is important to detect and diagnose fai...
With the increasing speed of computers, complexity of applications and large scale of applications, ...
Distributed systems comprising interacting services need runtime error detection to catch errors ari...
Distributed systems form an integral part of human life—from ATMs to the Domain Name Service. Typica...
As today's distributed applications increase in complexity, it becomes increasingly difficult to ...
A dependable software system must contain two dependability components: (i) error detection mechanis...
Runtime verification has primarily been developed and evaluated as a means of enriching the software...
Abstract — It is a challenge to provide detection facilities for large scale distributed systems run...
required to diagnose the failure, i.e., to identify the source of the failure. Diagnosis is challeng...
In distributed systems, if a hardware fault corrupts the state of a process, this error might propag...
Computing systems are vulnerable to anomalies that might occur during execution of deployed software...
For dependability outages in distributed internet infrastructures, it is often not enough to detect ...
Abstract. We explore the use of distributed processing to enhance the performance of explicit state en...