Debugging and profiling large-scale distributed applications is a daunting task. We present Friday, a system for debugging distributed applications that combines deterministic replay of components with the power of symbolic, low-level debugging and a simple language for expressing higher-level distributed conditions and actions. Friday allows the programmer to understand the collective state and dynamics of a distributed collection of coordinated application components as part of the debugging process. To evaluate Friday, we consider several distributed problems, including routing consistency in overlay networks and temporal state abnormalities caused by route flaps. We show via microbenchmarks and larger scale, application measurement th...
This thesis addresses the problem of debugging a distributed system. We define debugging as the proc...
Observation of global properties of a distributed program is required in many applications such as d...
This paper presents a practical paradigm, called on-the-fly replay. This paradigm consists of runn...
Software engineers have to face many problems when creating, testing and debugging their application...
Large-scale networks are among the most complex software infrastructures in existence. Unfortunately...
Debugging distributed systems is difficult. Most of the techniques that have been developed for debu...
Large-scale networks are among the most complex software infrastructures in existence. Unfortunatel...
I present a general framework for observing and controlling a distributed computation and its applic...
Large-scale networks are among the most complex software infrastructures in existence. Unfortunatel...
Thesis (Ph.D.)--University of Washington, 2019. Designing and debugging distributed systems is notorio...
Debugging distributed programs is considerably more difficult than debugging sequential programs. We...
This paper presents a taxonomy of parallel and distributed debuggers based on execution replay. Prog...
Clusters of shared-memory symmetric multiprocessors are increasingly used for high performance...