Abstract: Communication overhead is one of the most important factors affecting performance in message-passing multicomputers. We present evidence that communication locality exists, and that this locality is "structured". We propose a number of heuristics that can be used to "predict" the target of subsequent communication requests. Communication latency is hidden by reconfiguring the network concurrently with the computation. Quantitative results obtained from standard parallel benchmarks run on IBM SP systems are also presented.
Distributed shared-memory systems provide scalable performance and a convenient model for parallel p...
A benchmark test using the Message Passing Interface (MPI, an emerging standard for writing message ...
To amortize the cost of MPI communications, distributed parallel HPC applicati...
Effective overlap of computation and communication is a well understood technique for latency hiding...
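The overlap technique this abstract refers to can be illustrated with a minimal sketch (my own, not drawn from any of the cited papers): a nonblocking communication is "posted", independent computation proceeds while it is in flight, and completion is waited on only afterwards. A background thread with a simulated network delay stands in for the nonblocking transfer; in a real MPI program the analogous calls would be `MPI_Isend`/`MPI_Irecv` followed by `MPI_Wait`.

```python
import threading
import time

def overlap_demo(comm_delay=0.2, n_items=200_000):
    """Hide a simulated communication delay behind independent computation.

    The thread stands in for a nonblocking communication request; the
    loop is work that does not depend on the incoming data.
    """
    received = []

    def communicate():
        time.sleep(comm_delay)        # stand-in for network latency
        received.append("payload")    # the "message" arrives

    req = threading.Thread(target=communicate)
    req.start()                       # "post" the nonblocking transfer

    total = sum(i * i for i in range(n_items))  # independent computation

    req.join()                        # "wait" for the communication
    return total, received[0]

total, payload = overlap_demo()
```

Because the computation runs while the delay elapses, the total elapsed time approaches `max(compute_time, comm_delay)` rather than their sum, which is the whole point of latency hiding.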
An application package which allows the user to explore the possibility of hiding communication lat...
The goal of this paper is to gain insight into the relative performance of communication mechanisms ...
In many parallel applications, network latency causes a dramatic loss in processor utilization...
This paper describes the design and implementation of mechanisms for latency tolerance in the remote...
For parallel computers, the execution time of communication routines is an important determinant of ...
In this thesis, we studied the behavior of parallel programs to understand how to automate the task...
This work provides a systematic study of the impact of communication performance on parallel applic...