General-purpose programming on graphics processing units (GPGPU) has received a lot of attention in the parallel computing community as it promises to offer the highest performance per dollar. GPUs have been used extensively on regular problems that can be easily parallelized. In this paper, we describe two implementations of List Ranking, a traditional irregular algorithm that is difficult to parallelize on such massively multi-threaded hardware. We first present an implementation of Wyllie’s algorithm based on pointer jumping. This technique does not scale well to large lists because it performs suboptimal work. We then present a GPU-optimized, Recursive Helman-JáJá (RHJ) algorithm. Our RHJ implementation can rank a random list of...
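To make the pointer-jumping idea concrete, the following is a minimal sequential sketch in Python, not the paper's GPU implementation. It assumes the list is stored as a successor array in which the tail points to itself, and it takes the rank of a node to be its distance to the tail; the name `wyllie_rank` and this representation are illustrative assumptions. Each loop iteration stands in for one synchronous parallel round in which every node would be handled by its own thread.

```python
def wyllie_rank(succ):
    """Rank every node by its distance to the tail using pointer jumping.

    succ[i] is the index of node i's successor; the tail points to itself
    (an assumed representation, chosen for illustration).
    """
    n = len(succ)
    nxt = list(succ)
    # A node starts one hop away from its successor; the tail starts at 0.
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    # Each round doubles the distance a pointer spans, so about log2(n)
    # rounds are enough for every pointer to reach the tail.
    rounds = max(1, (n - 1).bit_length())
    for _ in range(rounds):
        # Build new arrays from the old ones so that every node reads the
        # values of the previous round, as in a synchronous parallel step.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank


# The list 3 -> 0 -> 2 -> 1, with node 1 as the tail.
print(wyllie_rank([2, 1, 1, 0]))
```

Every round updates all n nodes and there are roughly log2(n) rounds, so the total work is O(n log n), compared with O(n) for a sequential traversal. This is the suboptimal work the abstract refers to.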
An earlier parallel list ranking algorithm performs well for problem sizes $N$ that are extremely la...
In this paper, we show how to employ Graphics Processing Units (GPUs) to provide an efficient and hig...
This paper presents an algorithm for fast sorting of large lists using modern GPUs. The method achie...
Although parallel algorithms using linked lists, trees, and graphs have been studied extensi...
We present a number of optimization techniques to compute prefix sums on linked lists and implement ...
List ranking and list scan are two primitive operations used in many parallel algorithms that use li...
We present analytical and experimental results for fine-grained list ranking algorithms. We...
Graphics Processing Units (GPUs) are a fast-evolving architecture. Over the last decade their progra...
Parallel list ranking is a hard problem due to its extreme degree of irregularity. Also because of i...
Modern Graphics Processing Units (GPUs) provide high computation power at low cost and have been de...
Sorting algorithms have been studied extensively for the past three decades. Their uses are found in m...
This paper presents a comparative analysis of the three widely used parallel sorting algorithms: Odd...
We study the relationship between memory accesses, bank conflicts, thread multiplicity (also known a...