When debugging a distributed system, it is sometimes necessary to explain the absence of an event – for instance, why a certain route is not available, or why a certain packet did not arrive. Existing debuggers offer some support for explaining the presence of events, usually by providing the equivalent of a backtrace in conventional debuggers, but they are not very good at answering “Why not?” questions: there is simply no starting point for a possible backtrace. In this paper, we show that the concept of negative provenance can be used to explain the absence of events in distributed systems. Negative provenance relies on counterfactual reasoning to identify the conditions under which the missing event could have occurred. We define a form...
In large-scale networks, many things can go wrong: routers can be misconfigured, programs can be bug...
The distributed systems research community has developed many provably correct algorithms and abstra...
Thesis (Ph.D.)--University of Washington, 2019. Designing and debugging distributed systems is notorio...
Diagnosing and repairing problems in complex distributed systems has always been challenging. A wide...
In this paper, we propose a new approach to diagnosing problems in complex networks. Our approach i...
Failures in computing systems are unavoidable. Therefore, it is important to detect and diagnose fai...
Distributed systems play a critical role in people's daily lives. They provide functions such as ...
Many interesting large-scale systems are distributed systems of multiple communicating components. S...
Abstract: Formal methods for deciding the properties of service oriented systems are of paramount im...
The ability to reason about changes in a distributed system’s state enables network administrators t...
Distributed systems are ubiquitous but continue to be challenging to understand, build, and troubles...
Backward error recovery is one of the most used schemes to ensure fault-tolerance in distributed s...