One of the most challenging problems facing today's software engineer is understanding and modifying distributed systems. One reason is that, in actual use, systems frequently behave differently from what the designer intended. We describe a three-step method that lets a developer understand the run-time behavior of a distributed system. First, remote procedure calls (RPCs) are traced using CORBA interceptors. Next, the trace data is parsed to construct RPC call-return sequences, and summary statistics are generated. Finally, a visualization tool is used to study the statistics and look for anomalous behavior. We are testing this method on a large distributed system (more than 600,000 lines of code) during operation at a customer's site. Despite the fact ...
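The second step of the method (parsing trace data into RPC call-return sequences and generating summary statistics) can be illustrated with a minimal sketch. The abstract does not describe the trace format, so the four-field line layout, the operation names, and the helper names `parse_trace` and `summarize` below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace line format (an assumption, not the paper's format):
#   <timestamp_ms> <call|return> <call_id> <operation>
SAMPLE_TRACE = """\
0 call 1 Account.getBalance
5 call 2 Ledger.lookup
9 return 2 Ledger.lookup
12 return 1 Account.getBalance
20 call 3 Ledger.lookup
50 return 3 Ledger.lookup
60 call 4 Account.getBalance
"""


def parse_trace(text):
    """Pair call and return events by call_id.

    Returns (completed, unmatched): completed is a list of
    (operation, start_ms, end_ms) tuples, and unmatched is the sorted
    list of call_ids that never saw a return event.
    """
    open_calls = {}  # call_id -> (operation, start timestamp)
    completed = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        ts, event, call_id, operation = line.split()
        ts = int(ts)
        if event == "call":
            open_calls[call_id] = (operation, ts)
        elif event == "return":
            started = open_calls.pop(call_id, None)
            if started is not None:  # ignore returns with no recorded call
                completed.append((started[0], started[1], ts))
    return completed, sorted(open_calls)


def summarize(completed):
    """Compute per-operation call count, mean latency, and max latency."""
    durations = defaultdict(list)
    for operation, start, end in completed:
        durations[operation].append(end - start)
    return {
        op: {"count": len(d), "mean_ms": mean(d), "max_ms": max(d)}
        for op, d in durations.items()
    }


if __name__ == "__main__":
    completed, unmatched = parse_trace(SAMPLE_TRACE)
    for op, stats in summarize(completed).items():
        print(op, stats)
    print("calls without a return:", unmatched)
```

In this sketch, calls that never return are reported separately rather than dropped, since an unanswered call is one kind of anomalous behavior a developer might want to see in the third, visualization step.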
Large scale distributed systems are composed of many thousands of computing units. Today’s examples...
Large scale distributed systems are composed of many thousands of computing un...
This work introduces a method for instrumenting applications, producing execution traces, and visual...
Fay is a flexible platform for the efficient collection, processing, and analysis of software execut...
Testing a distributed system is difficult. Good testing depends on both skill and understanding the ...
Distributed computing systems are becoming more and more important in everyday life as well as in in...
This dissertation proposes generalized techniques to support software performance analysis using sys...
Distributed systems are ubiquitous but continue to be challenging to understand, build, and troubles...
We present a three-part approach for diagnosing bugs and performance problems in productio...
Stragglers, which are tasks that operate significantly slower than other tasks in a system, are a bi...
We propose a new class of profiler for distributed and heterogeneous systems. In these sys...
Tracing allows the analysis of task interactions with each other and with the operating sy...
Understanding a large execution trace is not an easy task due to the size and complexity of typical tra...