Previous schemes for sorting on general-purpose parallel machines have had to choose between poor load balancing and irregular communication, or multiple rounds of all-to-all personalized communication. In this paper, we introduce a novel variation on sample sort which uses only two rounds of regular all-to-all personalized communication in a scheme that yields very good load balancing with virtually no overhead. This algorithm was implemented in Split-C and run on a variety of platforms, including the Thinking Machines CM-5, the IBM SP-2, and the Cray Research T3D. We ran our code on widely different benchmarks to examine how the algorithm depends on the input distribution. Our experimental results are consistent with the...
A new approach to parallel sorting called Parallel Sorting by OverPartitioning (PSOP) is presented. ...
Sorting is one of the most fundamental algorithmic kernels, used by a large fraction of computer app...
The Parallel Disks Model (PDM) has been proposed to alleviate the I/O bottleneck that arises in the ...
We introduce a new deterministic parallel sorting algorithm based on the regular sampling approach...
We introduce a new deterministic parallel sorting algorithm for distributed memory machines based on...
A fundamental challenge for parallel computing is to obtain high-level, architecture independent, al...
Parallel sorting algorithms have been proposed for a variety of multiple instruction streams, multip...
We consider the often-studied problem of sorting, for a parallel computer. Given an input array dis...
Many sorting algorithms that perform well on uniformly distributed data suffer significant performan...
Clusters of symmetric multiprocessors (SMPs) have emerged as the primary candidates for large scale...