A fundamental challenge for parallel computing is to obtain high-level, architecture-independent algorithms that execute efficiently on general-purpose parallel machines. With the emergence of message passing standards such as MPI, it has become easier to design efficient and portable parallel algorithms by making use of these communication primitives. While existing primitives provide an assortment of collective communication routines, they do not handle an important communication event: the case in which most or all processors have non-uniformly sized personalized messages to exchange with each other. We focus in this paper on the h-relation personalized communication whose efficient implementation will allow high performance implementations...
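To make the term concrete: under the usual definition, an h-relation is a communication step in which every processor sends at most h words and receives at most h words. The following is a minimal sketch of how h could be computed for a given exchange pattern. The matrix `sizes` and the function name `h_relation` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def h_relation(sizes):
    """Return h for a personalized exchange.

    sizes[i][j] is the number of words processor i sends to processor j
    (an illustrative input format, not the paper's notation).
    """
    p = len(sizes)
    # Total words each processor sends (row sums).
    sent = [sum(row) for row in sizes]
    # Total words each processor receives (column sums).
    received = [sum(sizes[i][j] for i in range(p)) for j in range(p)]
    # h is the largest amount any single processor sends or receives.
    return max(max(sent), max(received))


# Three processors with non-uniformly sized personalized messages.
sizes = [
    [0, 5, 1],
    [2, 0, 7],
    [4, 3, 0],
]
h = h_relation(sizes)
print(h)
```

In this example the row sums are 6, 9, and 7 and the column sums are 6, 8, and 8, so the exchange is a 9-relation.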
We present an optimal algorithm for sorting n integers in the range [1, n^c] (for any constant c) fo...
A parallel sorting algorithm for sorting n elements evenly di...
We study the effect of limited communication throughput on parallel computation in a setting...
A fundamental challenge for parallel computing is to obtain high-level, architecture independent, al...
A fundamental challenge for parallel computing is to obtain high-level, architecture independent, al...
Previous schemes for sorting on general-purpose parallel machines have had to choose between poor l...
Previous schemes for sorting on general-purpose parallel machines have had to choose between poor lo...
We introduce a new deterministic parallel sorting algorithm based on the regular sampling approach...
Previous schemes for sorting on general-purpose parallel machines have had to choose between poor lo...
In this paper we present several algorithms for performing all-to-many personalized communication on...
This paper presents solutions for the problem of many-to-many personalized communication, with bound...
Technical report. We introduce a new deterministic parallel sorting algorithm for distributed memory m...
Clusters of symmetric multiprocessors (SMPs) have emerged as the primary candidates for large scale...
This paper presents algorithms for implementing the transportation primitive on a distributed memory...
Integer sorting is a subclass of the sorting problem where the elements have integer values and the ...