The rise of graphics processing units in high-performance computing has brought renewed interest in code optimization techniques that target SIMD processors. Many of these optimizations rely on divergence analyses, which classify a variable as uniform if it has the same value on every thread, or divergent if it might not. This paper introduces a new divergence analysis that can represent variables as affine functions of thread identifiers. We have implemented our divergence analysis with affine constraints on top of Ocelot, an open-source compiler, and used it to analyze a suite of 177 CUDA kernels from well-known benchmarks. These experiments show that our algorithm reports 4% fewer divergent variables than the previous...
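The abstract above describes classifying variables as uniform, divergent, or affine in the thread identifier. As an informal illustration only (not the paper's actual lattice or transfer functions), the core idea can be sketched as a tiny abstract domain in which every variable is tracked as `a*tid + b`; the names `Affine`, `add`, and `mul` are hypothetical helpers for this sketch:

```python
# Toy sketch of a divergence analysis with affine constraints: each
# variable is abstracted as a*tid + b. A stride a == 0 means the variable
# is uniform; a known non-zero stride keeps it affine in the thread id;
# an unknown coefficient (None) means we cannot prove anything.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Affine:
    """Abstract value a*tid + b; coefficients are ints or None (unknown)."""
    a: Optional[int]  # stride with respect to the thread identifier
    b: Optional[int]  # thread-independent offset

    @property
    def uniform(self) -> bool:
        return self.a == 0

TID = Affine(1, 0)          # the thread identifier itself: 1*tid + 0

def const(c: int) -> Affine:
    return Affine(0, c)     # literal constants are uniform

def add(x: Affine, y: Affine) -> Affine:
    # (a1 + a2)*tid + (b1 + b2); unknown coefficients stay unknown
    return Affine(None if None in (x.a, y.a) else x.a + y.a,
                  None if None in (x.b, y.b) else x.b + y.b)

def mul(x: Affine, y: Affine) -> Affine:
    # A product stays affine only if one operand is a known uniform value;
    # tid * tid is quadratic in tid, so it falls out of the domain.
    if x.uniform and x.b is not None:
        return Affine(None if y.a is None else x.b * y.a,
                      None if y.b is None else x.b * y.b)
    if y.uniform:
        return mul(y, x)
    return Affine(None, None)  # divergent: no affine description

# Example: base + 4*tid is affine with stride 4, so consecutive threads
# touch consecutive addresses -- the kind of fact that enables
# memory-coalescing and branch-fusion optimizations.
idx = add(const(256), mul(const(4), TID))  # -> Affine(a=4, b=256)
```

In this toy model, `idx` is provably affine (stride 4), a constant is uniform, and `tid * tid` collapses to the divergent top element.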
Branch divergence is a common performance problem in GPGPU computing, in which the execution o...
The energy costs of data movement are limiting the performance scaling of future generations of high...
Parallel architectures following the SIMT model such as GPUs benefit from application regularity by ...
The rising popularity of graphics processing units is bringing renewed interest in code optimization...
Growing interest in graphics processing units has brought renewed attention to...
Graphics processing units (GPUs) are composed of a group of single-instruction multiple data (SIMD) s...
Data-parallel architectures must provide efficient support for complex control-flow constru...
We propose a compiler analysis pass for programs expressed in the Single Program, Multiple Data (SPM...
The increasing popularity of Graphics Processing Units (GPUs) has brought ren...
Vectorizing compilers employ divergence analysis to detect at which program point a specific variabl...
Graphics processing units (GPUs) have recently evolved into popular accelerators for general-purpose...