Branch divergence has a significant impact on the performance of GPU programs. We propose two novel software-based optimizations, called iteration delaying and branch distribution, that aim to reduce branch divergence. Iteration delaying targets a divergent branch enclosed by a loop within a kernel. It improves performance by executing loop iterations that take the same branch direction and delaying those that take the other direction until later iterations. Branch distribution reduces the length of divergent code by factoring out structurally similar code from the branch paths. We conduct a preliminary evaluation of the two optimizations using both synthetic benchmarks and a highly optimized real-world application. Our evaluation shows ...
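To illustrate iteration delaying, the following is a minimal Python sketch that simulates a single warp and is not the implementation evaluated in this work. It rests on several assumptions that the abstract does not state: the direction to execute in each pass is chosen by majority vote among the threads' next pending iterations, every thread runs the same number of iterations, each thread's iterations are independent of the other threads, and cost is measured simply as the number of passes the warp issues. The names `baseline_passes`, `delayed_passes`, and `warp` are hypothetical.

```python
def baseline_passes(directions):
    """Lock-step execution: directions[t][i] is the branch direction
    (True/False) that thread t takes in loop iteration i.
    In each iteration the warp issues one pass per distinct direction
    present, so a divergent iteration costs two passes."""
    n_iters = len(directions[0])  # assumption: same trip count per thread
    return sum(len({d[i] for d in directions}) for i in range(n_iters))


def delayed_passes(directions):
    """Iteration delaying (sketch): in each pass the warp picks one
    direction by majority vote (an assumed policy). Threads whose next
    pending iteration takes that direction execute it; the remaining
    threads delay that iteration to a later pass."""
    nxt = [0] * len(directions)  # index of each thread's next iteration
    passes = 0
    while True:
        active = [t for t, d in enumerate(directions) if nxt[t] < len(d)]
        if not active:
            return passes
        taken = sum(directions[t][nxt[t]] for t in active)
        chosen = 2 * taken >= len(active)  # ties resolve to True
        for t in active:
            if directions[t][nxt[t]] == chosen:
                nxt[t] += 1
        passes += 1


# Hypothetical 4-thread warp whose threads alternate directions out of phase.
warp = [
    [True, False, True, False],
    [False, True, False, True],
    [True, False, True, False],
    [False, True, False, True],
]
print(baseline_passes(warp), delayed_passes(warp))
```

In this toy input every lock-step iteration is divergent, whereas delaying lets the out-of-phase threads fall into step after the first pass. Real kernels would also need to account for the bookkeeping cost of the vote and for the different lengths of the two branch paths, which this sketch ignores.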
Graphics processing units (GPU), due to their massive computational power with up to thousands of co...
Heterogeneous computing systems using one or more graphics processing units (GPUs) as accelerators p...
Thread divergence optimization in GPU architectures has long been hindered by...
Graphics processing units (GPUs) are composed of a group of single-instruction multiple data (SIMD) s...
There has been a tremendous growth in the use of Graphics Processing Units (GPU) for the acceleratio...
In this paper, we address the design and implementation of GPU-accelerated Bra...
Graphics processing units (GPUs) have recently evolved into popular accelerators for general-purpose...
Many applications with regular parallelism have been shown to benefit from using Graphics Processing...
Control and memory divergence between threads within the same execution bundle, or warp, ha...
Branch divergence is a very commonly occurring performance problem in GPGPU in which the execution o...
We propose a generalized method for adapting and optimizing algorithms for efficient execution on mo...
The increasing popularity of Graphics Processing Units (GPUs) has brought ren...
Biomedical application speed requirements have made general purpose graphics processing unit (GPU) a...
In this paper, we propose a pioneering work on designing and programming B&B al...