One of the most challenging problems facing today's software engineer is to understand and modify distributed systems. One reason is that, in actual use, systems frequently behave differently than the developer intended. To cope with this challenge, we have developed a three-step method to study the run-time behavior of a distributed system. First, remote procedure calls are traced using CORBA interceptors. Next, the trace data is parsed to construct RPC call-return sequences, and summary statistics are generated. Finally, a visualization tool is used to study the statistics and look for anomalous behavior. We have been using this method on a large distributed system (more than 500,000 lines of code) with data collected during both sy...
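The second step of the method (parsing trace data into call-return sequences and generating summary statistics) can be illustrated with a minimal sketch. The abstract does not describe the trace format, so the record layout `(timestamp_ms, kind, call_id, operation)`, the sample data, and the function names `pair_calls` and `summarize` are all assumptions made for illustration; this is not the authors' implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace records: (timestamp_ms, kind, call_id, operation).
# The layout and the values are assumed; the source does not specify them.
TRACE = [
    (0.0, "call", 1, "Account.balance"),
    (4.0, "return", 1, "Account.balance"),
    (5.0, "call", 2, "Account.deposit"),
    (6.0, "call", 3, "Account.balance"),
    (12.0, "return", 3, "Account.balance"),
    (25.0, "return", 2, "Account.deposit"),
    (30.0, "call", 4, "Account.deposit"),  # no matching return in the trace
]


def pair_calls(trace):
    """Match each 'call' event with its 'return' event by call_id.

    Returns (pairs, pending): pairs is a list of (operation, latency_ms),
    pending maps call_id -> (start_ms, operation) for unmatched calls.
    """
    pending = {}
    pairs = []
    for ts, kind, call_id, op in sorted(trace, key=lambda r: r[0]):
        if kind == "call":
            pending[call_id] = (ts, op)
        elif kind == "return" and call_id in pending:
            start, call_op = pending.pop(call_id)
            pairs.append((call_op, ts - start))
    return pairs, pending


def summarize(pairs):
    """Compute per-operation call count, mean latency, and max latency."""
    by_op = defaultdict(list)
    for op, latency in pairs:
        by_op[op].append(latency)
    return {
        op: {"count": len(v), "mean_ms": mean(v), "max_ms": max(v)}
        for op, v in by_op.items()
    }


if __name__ == "__main__":
    pairs, pending = pair_calls(TRACE)
    for op, stats in summarize(pairs).items():
        print(op, stats)
    print("calls without a return:", sorted(pending))
```

Calls that never return are kept in `pending` rather than dropped, since unmatched calls are one plausible kind of anomaly a later visualization step might surface.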
We propose a new class of profiler for distributed and heterogeneous systems. In these sys...
We present a three-part approach for diagnosing bugs and performance problems in productio...
This work introduces a method for instrumenting applications, producing execution traces, and visual...
Testing a distributed system is difficult. Good testing depends on both skill and understanding the ...
Distributed computing systems are becoming more and more important in everyday life as well as in in...
Fay is a flexible platform for the efficient collection, processing, and analysis of software execut...
This dissertation proposes generalized techniques to support software performance analysis using sys...
Tracing allows the analysis of task interactions with each other and with the operating sy...
Distributed, real-time, and embedded (DRE) systems are becoming increasingly complex, and as a resul...
Stragglers, which are tasks that operate significantly slower than other tasks in a system, are a bi...
The causes of performance changes in a distributed system often elude even its developers. This pape...
Distributed systems are ubiquitous but continue to be challenging to understand, build, and troubles...
Understanding a large execution trace is not an easy task due to the size and complexity of typical tra...