Heterogeneous computing systems using one or more graphics processing units (GPUs) as accelerators present unique load-balancing challenges due to the architecture of the GPUs. Assigning a part of the workload proportional to the throughput of the GPU is unlikely to achieve the peak theoretical performance of the GPU, partly because of branch divergence. Additionally, for workloads that depend on pseudo-random numbers, the branch divergence may appear unpredictable, making it hard to work around. In this thesis, we present an approach for reorganizing pseudo-random workloads before execution on the GPU, with the goal of reducing branch divergence. In our experiments, the method achieves a speedup in kernel execution time of up to 1.45 on ...
High compute-density with massive thread-level parallelism of Graphics Processing Units (GPUs) is be...
Heterogeneous parallel architectures like those comprised of CPUs and GPUs are a tantalizing compute...
GPUs (Graphics Processing Units) have become one of the main co-processors that contributed to deskt...
Widespread heterogeneous parallelism is unavoidable given the emergence of General-Purpose computing...
Scientific codes are usually highly parallelised and executed on heterogeneous architectures. Nowada...
As computing systems continue to increase in complexity, energy optimization plays a key role in the...
The use of GPU clusters for scientific applications in areas such as physics, chemistry and...
The Graphics Processing Unit (GPU) is present in almost every modern-day personal computer. Despite...
We explore software mechanisms for managing irregular tasks on graphics processing units (GPUs). We ...
We propose a GPU fine-grained load-balancing abstraction that decouples load balancing from work pro...
Today's heterogeneous architectures bring together multiple general-purpose CPUs, domain-specific GP...
The computational power provided by many-core graphics processing units (GPUs) has been exploited i...
With the emergence of General Purpose computation on GPU (GPGPU) and corresponding programming fram...
Recently, heterogeneous system architectures are becoming mainstream for achieving high perf...