Recent years have witnessed phenomenal growth in the application and capabilities of Graphics Processing Units (GPUs), owing to their high parallel computation power at relatively low cost. However, writing a computationally efficient GPU program (kernel) is challenging, and generally only a few specific kernel configurations lead to significant increases in performance. Auto-tuning is the process of automatically optimizing software for highly efficient execution on a target hardware platform. It is particularly useful for GPU programming, as a single kernel requires re-tuning after code changes, for different input data, and for different architectures. However, the discrete and non-convex nature of the search space creates a c...
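To make the notion of tuning over a discrete, non-convex configuration space concrete, here is a minimal sketch of a random-search auto-tuner in Python. Everything in it is an illustrative assumption rather than the method of the abstract above: the parameter names (`block_x`, `block_y`, `tile`), the `synthetic_runtime` cost function (a toy stand-in for actually compiling and timing a kernel), and the `random_search` helper are all hypothetical.

```python
import itertools
import random

# Hypothetical tunable parameters of a GPU kernel (names are illustrative).
SEARCH_SPACE = {
    "block_x": [16, 32, 64, 128],
    "block_y": [1, 2, 4, 8, 16],
    "tile": [1, 2, 3, 4],
}


def synthetic_runtime(block_x, block_y, tile):
    """Toy stand-in for a measured kernel time in ms (not a real GPU model).

    Returns None for configurations that could not be launched.
    """
    threads = block_x * block_y
    if threads > 1024:  # per-block thread limit on current CUDA GPUs
        return None
    # Non-convex toy cost: best near 256 threads, odd tile sizes are penalized.
    return abs(threads - 256) / 64 + (3 if tile % 2 else 0) + 1.0 / tile + 1.0


def random_search(space, cost, budget, seed=0):
    """Evaluate up to `budget` distinct random configurations; keep the fastest."""
    rng = random.Random(seed)
    names = list(space)
    configs = list(itertools.product(*(space[n] for n in names)))
    best_cfg, best_time = None, float("inf")
    for values in rng.sample(configs, min(budget, len(configs))):
        cfg = dict(zip(names, values))
        t = cost(**cfg)
        if t is not None and t < best_time:  # skip invalid configurations
            best_cfg, best_time = cfg, t
    return best_cfg, best_time


if __name__ == "__main__":
    cfg, t = random_search(SEARCH_SPACE, synthetic_runtime, budget=20)
    print(cfg, t)
```

In a real tuner the cost function would compile and benchmark the kernel for each configuration, which is why the evaluation budget matters: this toy space has only 80 points, but realistic spaces can be far too large to enumerate.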
Graphics Processing Units (GPUs) have revolutionized the computing landscape over the past decade. H...
The focus of this work is the automatic performance tuning of stencil computations on Graphics Proce...
Finding optimal parameter configurations for tunable GPU kernels is a non-trivial exercise for large...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
Graphics Processing Units (GPUs) have revolutionized the computing landscape in the past decade and ...
High-performance computing is increasingly being done on parallel machines like GPUs. In my work, I ...
We have developed several autotuning benchmarks in CUDA that take into account performance-relevant ...
Optimal performance is an important goal in compute-intensive applications. For GPU applications, th...
In high-performance computing, excellent node-level performance is required for the efficient use of...
Writing high-performance GPGPU code is often difficult and time-consuming, potentially requiring lab...
We present a novel strategy for automatic performance tuning of GPU programs. The strategy combines ...
GPUs have been used for years in compute-intensive applications. Their massive parallel processing c...