The ability to reproduce a parallel execution is desirable for both debugging and program reliability. In debugging (13), the programmer manually steps back in time, while for resilience (6) the application does so automatically upon failure. To be useful, replay has to faithfully reproduce the original execution. For parallel programs, the main challenge is inferring and maintaining the order of conflicting operations (data races). Deterministic record and replay (R&R) techniques have been developed for multithreaded shared-memory programs (5), as well as for distributed-memory programs (14). Our main interest is in techniques for large-scale scientific (3; 4) programming models.
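As a rough illustration of what maintaining the order of conflicting operations can involve, the sketch below records the order in which threads perform a conflicting update on shared state and then enforces that order in a second run. It is a minimal toy under stated assumptions, not the technique of any cited system: the names `OrderLog` and `execute` are hypothetical, each conflicting operation is assumed to pass through a single instrumented entry point, and the replayed run is assumed to issue the same number of operations per thread as the recorded one.

```python
import threading


class OrderLog:
    """Records, or enforces, the global order of conflicting operations.

    Hypothetical helper for illustration only.
    """

    def __init__(self, recorded=None):
        self.replaying = recorded is not None
        self.order = list(recorded) if recorded is not None else []
        self.position = 0  # index of the next operation in the global order
        self.cond = threading.Condition()

    def run(self, tid, operation):
        with self.cond:
            if self.replaying:
                # Block until the log says it is this thread's turn.
                self.cond.wait_for(lambda: self.order[self.position] == tid)
            else:
                # Record mode: note which thread got here first.
                self.order.append(tid)
            operation()  # the conflicting access, performed in log order
            self.position += 1
            self.cond.notify_all()


def execute(num_threads, steps, recorded=None):
    """Run the toy program; returns (order log, observed shared trace)."""
    log = OrderLog(recorded)
    trace = []  # shared, order-sensitive state

    def worker(tid):
        for _ in range(steps):
            log.run(tid, lambda: trace.append(tid))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log.order, trace


if __name__ == "__main__":
    recorded, original = execute(4, 50)            # record run
    _, replayed = execute(4, 50, recorded)         # replay run
    print("replay matches original:", replayed == original)
```

Serializing every conflicting operation through one condition variable keeps the sketch short but would be far too costly at scale; practical R&R systems typically log much less and enforce only the orderings that matter.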
Parallel profiling tools, such as ThreadScope for Parallel Haskell, allow programmers to obtain info...
Abstract. Alongside the rise of multiprocessor machines, the concurrent programming model has grown ...
In the area of debugging parallel executions, record and replay is a technique that allows determini...
Shared-memory parallel programs are inherently nondeterministic, making it difficult to diagnose rar...
The debugging cycle is the most common methodology for finding and correcting errors in sequential p...
Clusters of shared-memory symmetric multiprocessors are increasingly used for high performance...
Record and deterministic Replay (RnR) is a primitive with many proposed applications in computer sys...
This paper presents a tool that enables programmers to use dynamic testing tools for debugging non-...
This paper presents a taxonomy of parallel and distributed debuggers based on execution replay. Prog...
Debugging MIMD programs is often a delicate job. As a matter of fact, they can have different behavi...
Ability to replay a program’s execution on a multi-processor system can significantly help parallel ...
Alongside the rise of multi-processor machines, concurrent programming models have grown to near ubi...