The LogP model characterizes the performance of modern parallel machines with a small set of parameters: the communication latency (L), the overhead (o), the bandwidth (expressed as the gap g between consecutive messages), and the number of processors (P). In this paper, we analyze four parallel sorting algorithms (bitonic, column, radix, and sample sort) under LogP. We develop implementations of these algorithms in a parallel extension to C and compare their actual performance on a CM-5 of 32 to 512 processors with that predicted by LogP using parameter values for this machine. In our experience, the model served as a valuable guide throughout the development of the fast parallel sorts and revealed subtle defects in the implementations. The final observed performance matches closely with the ...
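To make the role of the four parameters concrete, the following is a minimal sketch of the standard LogP cost estimates for point-to-point communication. It is not taken from the paper's implementations. The class name `LogP`, the helper names, and the numeric values used in the usage note are hypothetical and chosen only for illustration; they are not measured CM-5 parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    """LogP parameters (all times in the same unit, e.g. microseconds)."""
    L: float  # latency: time a small message spends in the network
    o: float  # overhead: processor time spent sending or receiving one message
    g: float  # gap: minimum interval between consecutive sends on one processor
    P: int    # number of processors


def one_message_time(m: LogP) -> float:
    """Send overhead, then network latency, then receive overhead."""
    return m.o + m.L + m.o


def pipelined_send_time(m: LogP, n: int) -> float:
    """Time until the last of n back-to-back messages has been received.

    Successive sends are spaced by max(g, o); the last message then
    needs one full end-to-end delivery.
    """
    if n <= 0:
        return 0.0
    return (n - 1) * max(m.g, m.o) + one_message_time(m)
```

With the hypothetical values `LogP(L=6.0, o=2.0, g=4.0, P=32)`, a single message costs 2.0 + 6.0 + 2.0 = 10.0 time units, and five pipelined messages cost 4 * 4.0 + 10.0 = 26.0. Once several messages are in flight, the gap g, not the latency L, dominates the cost, which is why g stands in for bandwidth in the model.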
Abstract. We give a parallel implementation of merge sort on a CREW PRAM that uses n processors and ...
We present work-preserving emulations with small slowdown between LogP and two other parallel models...
Abstract. We present a simple deterministic parallel algorithm that runs on a CRCW PRAM and sorts n in...
Previous schemes for sorting on general-purpose parallel machines have had to choose between poor lo...
We present the design and implementation of a parallel out-of-core sorting algorithm, which is based...
Abstract. Sample sort, a generalization of quicksort that partitions the input into many pieces, is ...
Abstract. We study the problem of sorting on a parallel computer with limited communication bandwidt...
Sorting has received the most attention among all computational tasks over the past years because sorted ...
Background: Sorting algorithms are an essential part of computer science. With the use of parallelis...
The expanding use of multi-processor supercomputers has made a significant impact on the speed and s...
The study presents a comparative study of some sorting algorithms with the aim to come up with the mo...
In this paper, we propose a taxonomy of parallel sorting that includes a broad range of array and f...
We report the performance of NOW-Sort, a collection of sorting implementations on a Network of Work...
Parallel sorting techniques have become of practical interest with the advent of new multiprocessor ...