A common debugging strategy involves re-executing a program (on a given input) over and over, each time gaining more information about bugs. Such techniques can fail on message-passing parallel programs. Because of variations in message latencies and process scheduling, different runs on the given input may produce different results. This non-repeatability is a serious debugging problem, since an execution cannot always be reproduced to track down bugs. This paper presents a technique for tracing and replaying message-passing programs for debugging. Our technique is optimal in the common case and has good performance in the worst case. By making run-time tracing decisions, we trace only a fraction of the total number of messages, gaining...
One of the major difficulties in debugging concurrent programs is that the programmer usually experi...
Concurrent programs are ubiquitous, from the high-end servers to personal machines, due to the fact ...
The ability to reproduce a parallel execution is desirable for debugging and program reliability pur...
Debugging requires execution replay. Locations of bugs are rarely known in advance, so an execution ...
Multicore is here to stay. To keep up with the hardware innovation, software developers must move fro...
Debugging concurrent programs is known to be difficult due to scheduling non-determinism. The techni...
Testing and debugging parallel programs is often difficult and tedious since concurrently executing ...
The debugging cycle is the most common methodology for finding and correcting errors in sequential p...
Clusters of shared-memory symmetric multiprocessors are increasingly used for high performance...
The debugging cycle is the most common methodology for finding and correcting errors in sequential p...
While a lot of work has been focused on design and programming of shared memory multi-core architect...
Significant time is spent by companies trying to reproduce and fix bugs. BugNet is a recent architec...
Debugging of concurrent systems is a tedious and error-pron...
Debugging is generally considered to be difficult. The increased complexity and non-determinism of p...
The problems of debugging parallel programs have been known for quite some time. However, the litera...