Abstract. Autotuning is an established technique for adjusting performance-critical parameters of applications to their specific run-time environment. In this paper, we investigate the potential of online autotuning for general purpose computation on GPUs. Our application-independent autotuner AtuneRT optimizes GPU-specific parameters such as block size and loop-unrolling degree. We also discuss the peculiarities of autotuning on GPUs. We demonstrate tuning potential using CUDA and by instrumenting the parallel algorithms library Thrust. We evaluate our online autotuning approach with various GPUs and sample applications.
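To illustrate what online tuning of a parameter such as block size can look like, the following is a minimal host-side sketch in C++. It is not AtuneRT's interface: the class `OnlineTuner`, its methods, and the function `simulated_kernel_time` are hypothetical names introduced here, and the exhaustive try-each-candidate-once strategy is an assumption chosen for brevity. The synthetic cost function stands in for timing a real kernel launch.

```cpp
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical online tuner (illustration only): tries every candidate
// value once during normal program execution, then keeps the fastest.
class OnlineTuner {
public:
    explicit OnlineTuner(std::vector<int> candidates)
        : candidates_(std::move(candidates)) {}

    // Parameter value to use for the next kernel launch.
    int next() const {
        return trial_ < candidates_.size() ? candidates_[trial_] : best_;
    }

    // Feed back the measured run time of the launch that used next().
    void report(double elapsed) {
        if (trial_ >= candidates_.size()) return;  // search already finished
        if (elapsed < best_time_) {
            best_time_ = elapsed;
            best_ = candidates_[trial_];
        }
        ++trial_;
    }

    bool converged() const { return trial_ >= candidates_.size(); }
    int best() const { return best_; }

private:
    std::vector<int> candidates_;
    std::size_t trial_ = 0;
    int best_ = 0;  // remains 0 if no candidate was ever measured
    double best_time_ = std::numeric_limits<double>::infinity();
};

// Stand-in for timing a real kernel launch (assumption): a synthetic
// cost curve whose minimum lies at a block size of 256.
double simulated_kernel_time(int block_size) {
    const double d = static_cast<double>(block_size - 256);
    return 1.0 + d * d * 1e-5;
}
```

In an application, each iteration of the main loop would call `next()`, launch the kernel with that block size, measure the elapsed time, and pass it to `report()`. Once `converged()` returns true, every further launch uses the best value found.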
The unprecedented prevalence of GPGPU is largely attributed to its abundant on-chip register resourc...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
GPUs, with their high bandwidths and computational capabilities, are an increasingly popular target f...
A large amount of resources is spent writing, porting, and optimizing scientific and industrial Hig...
Optimal performance is an important goal in compute-intensive applications. For GPU applications, th...
We have developed several autotuning benchmarks in CUDA that take into account performance-relevant ...
We present a novel strategy for automatic performance tuning of GPU programs. The strategy combines ...
Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)...
GPUs have been used for years in compute-intensive applications. Their massive parallel processing c...
Graphics Processing Units (GPUs) have revolutionized the computing landscape over the past decades. ...
2012-05-02: Graphics Processing Units (GPUs) have evolved to devices with teraflop-level performance p...
In this paper, we present our implementation of an autotuning system, written in C++, which incorpo...
Autotuning is an established technique for optimizing the performance of parallel applications. Howe...