An earlier parallel list ranking algorithm performs well for problem sizes $N$ that are extremely large in comparison to the number of PUs $P$. However, no existing algorithm gives good performance for reasonable loads. We present a novel family of algorithms that achieve a better trade-off between the number of start-ups and the routing volume. We have implemented them on an Intel Paragon, and they considerably outperform all earlier algorithms: with $P = 2$ the sequential algorithm is already beaten for $N = \mbox{25,000}$; for $P = 100$ and $N = 10^7$, the speed-up is 21, and for $N = 10^8$ it reaches 30. A modification of one of our algorithms solves a theoretical question: we show that on one-dimensional processor arr...
We developed analogous parallel algorithms to implement CostRank for distributed memory parallel com...
We present analytical and experimental results for fine-grained list ranking algorithms. We...
We present a parallel algorithm for the prefix sums problem which runs in time O(log n/log lo...
Parallel list ranking is a hard problem due to its extreme degree of irregularity. Also because of i...
The list-ranking problem is considered for parallel computers which communicate through an i...
Although parallel algorithms using linked lists, trees, and graphs have been studied extensi...
Two improved list-ranking algorithms are presented. The ``peeling-off'' algorithm leads to an optima...
Novel algorithms are presented for parallel and external memory list-ranking. The same algorithms ca...
List ranking and list scan are two primitive operations used in many parallel algorithms that use li...
We consider the problem of ranking an N element list on a P processor EREW PRAM. Recent work on this...
We present a randomized parallel list ranking algorithm for distributed memory multiprocessors. A si...
General purpose programming on the graphics processing units (GPGPU) has received a lot of attention...