While a lot of work has been focused on design and programming of shared memory multi-core architectures, message passing architectures are increasingly being considered an attractive design point for many-core [10] and application-specific [2] processors. A big concern with message passing architectures, however, is programmability and debuggability on such machines and the significant overhead of providing support for the same at software level. In this paper, we take a first look at providing hardware support for debugging and replay of message passing programs on message passing architectures. We propose a hardware framework for logging races between messages to allow deterministic replay of message passing programs. One implementat...
With the arrival of multicore chips as the commodity architecture for a wide range of platforms, th...
With the arrival of multicore chips as the commodity architecture for a wide range of platforms, the...
To support incremental replay of message-passing applications, processes must periodically checkpoin...
A common debugging strategy involves re-executing a program (on a given input) over and over, each t...
Significant time is spent by companies trying to reproduce and fix bugs. BugNet is a recent architec...
Clusters of shared-memory symmetric multiprocessors are increasingly used for high performance...
Ability to replay a program’s execution on a multi-processor system can significantly help parallel ...
Record and deterministic Replay (RnR) is a primitive with many proposed applications in computer sys...
To support incremental replay of message-passing applications, processes must periodically checkpoin...
The processor industry is at an inflection point. In the past, performance was the driving force beh...
In this paper we present an execution replay system for Athapascan, an MPI-based multi-threaded runt...
To support incremental replay of message-passing applications, processes must periodically checkpoin...
Debugging MIMD programs is often a delicate job. As a matter of fact, they can have different behavi...
Recent research in deterministic record-replay seeks to ease debugging, security, and fault tolerance...
Shared-memory parallel programs are inherently nondeterministic, making it difficult to diagnose rar...