When confronted with a buggy execution of a distributed system, which is commonplace for distributed systems software, understanding what went wrong requires significant expertise, time, and luck. As the first step towards fixing the underlying bug, software developers typically start debugging by manually separating out events that are responsible for triggering the bug (signal) from those that are extraneous (noise). In this thesis, we investigate whether it is possible to automate this separation process. Our aim is to reduce time and effort spent on troubleshooting, and we do so by eliminating events from buggy executions that are not causally related to the bug, ideally producing a “minimal causal sequence” (MCS) of triggering events. We show th...
In debugging distributed programs a distinction is made between an observed error and the program fa...
This paper describes parts of the design of a debugger for a distributed real-time multimedia system...
Software engineers have to face many problems when creating, testing and debugging their application...
Thesis (Ph.D.)--University of Washington, 2019. Designing and debugging distributed systems is notorio...
Distributed systems are ubiquitous but continue to be challenging to understand, build, and troubles...
Debugging distributed programs is considerably more difficult than debugging sequential programs. We...
Failures in computing systems are unavoidable. Therefore, it is important to detect and diagnose fai...
I present a general framework for observing and controlling a distributed computation and its applic...
We propose a low-overhead sampling infrastructure for gathering information from the executions expe...
Developing correct and efficient software for large scale systems is a challenging task. Developers ...
Debugging concurrent programs is known to be difficult due to scheduling non-determinism. The techni...
Debugging is a tedious and time-consuming process for software developers. Therefore, providing effe...
Today's largest systems have over 100,000 cores, with million-core systems expected over the next fe...
Debugging real systems is hard, requires deep knowledge of the code, and is time-consuming. Bug repo...
Debugging of distributed software is approached in this paper by defining specific classes of progra...