Abstract. We explore the backtracking paradigm with properties seen as sub-optimal for GPU architectures, using as a case study the maximal clique enumeration problem, and find that the presence of these properties limits GPU performance to approximately 1.4--2.25 times a single CPU core. The GPU performance "lessons" we find critical to providing this performance include a coarse-and-fine-grain parallelization of the search space, a low-overhead load-balanced distribution of work, global memory latency hiding through coalescence, saturation, and shared memory utilization, and the use of GPU output buffering as a solution to irregular workloads and a large solution domain. We also find a strong reliance on an efficient global problem struc...
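To make the case study concrete, the following is a minimal sequential sketch of backtracking maximal clique enumeration. It assumes the classic Bron-Kerbosch scheme with pivoting, which the abstract does not name, so it should be read as an illustration of the problem rather than as the authors' GPU method. The function name `maximal_cliques` and the adjacency-dictionary input format are hypothetical choices made for this example.

```python
from typing import Dict, FrozenSet, Iterator, Set


def maximal_cliques(adj: Dict[int, Set[int]]) -> Iterator[FrozenSet[int]]:
    """Yield every maximal clique of an undirected graph.

    adj maps each vertex to the set of its neighbours (no self-loops).
    Assumption: Bron-Kerbosch with pivoting, used here only as a
    familiar example of the backtracking paradigm.
    """

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> Iterator[FrozenSet[int]]:
        # r: current clique, p: candidates that can extend r,
        # x: vertices already explored that would also extend r.
        if not p and not x:
            # Nothing can extend r, so r is maximal.
            yield frozenset(r)
            return
        # Choose the pivot with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            # Branch: add v and keep only candidates adjacent to v.
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            # Backtrack: v moves from the candidate set to the excluded set.
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Small example graph: a triangle 0-1-2 plus a pendant edge 2-3.
example_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
example_cliques = set(maximal_cliques(example_graph))
```

Each recursive call explores an independent subtree of the search space, and the subtrees can differ greatly in size. That irregularity is the kind of property the abstract describes as sub-optimal for GPU architectures.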
Performance analysis is a daunting job, especially for the rapidly evolving accelerator technologies. ...
Sparse problems arise from a variety of applications, from scientific simulations to graph analytics...
Graphics Processing Units (GPUs) are a fast-evolving architecture. Over the last decade their progra...
We present an iterative breadth-first approach to maximum clique enumeration on the GPU. The memory ...
New GPGPU technologies, such as CUDA Dynamic Parallelism (CDP), can help deali...
We present an efficient model to analyze and improve the performance of general-purpose computation ...
Analytical performance models yield valuable architectural insight without incurring the excessive r...
The last few years have seen an explosion of effort in designing algorithms that harness the power of...
These are the artifacts of the paper "Efficient Maximal Biclique Enumeration on GPUs", which will appear i...
GPUs are gaining fast adoption as high-performance computing architectures, mainly because of their ...
Many applications with regular parallelism have been shown to benefit from using Graphics Processing...
Computing systems today rely on massively parallel and heterogeneous architectures to promise very h...
We develop a microbenchmark-based performance model for NVIDIA GeForce 200-series GPUs. Our model id...
GPUs have become popular due to their high computational power. Data scientists rely on GPUs to proc...