In this paper, we develop an approach to GPU kernel optimization by focusing on identification of bottleneck resources and determining optimization parameters that can alleviate the bottleneck. Performance modeling for GPUs is done by abstract kernel emulation along with latency/gap modeling of resources. Sensitivity analysis with respect to resource latency/gap parameters is used to predict the bottleneck resource for a given kernel’s execution. The utility of the bottleneck analysis is demonstrated in two contexts: 1) Coupling the new bottleneck-driven optimization strategy with the OpenTuner auto-tuner: experimental results on all kernels from the Rodinia suite and GPU tensor contraction kernels from the NWChem comp...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
This paper presents a novel optimizing compiler for general purpose computation on graphics processi...
GPUs are gaining fast adoption as high-performance computing architectures, mainly because of their ...
Recent years have witnessed phenomenal growth in the application and capabilities of Graphical Proc...
Future computing systems, from handhelds to supercomputers, will undoubtedly be more para...
Graphics Processing Units (GPUs) have revolutionized the computing landscape in the past decade and ...
General purpose application development for GPUs (GPGPU) has recently gained momentum as a cost-effe...
Many computationally-intensive algorithms benefit from the wide parallelism of...
The increasing programmability, performance, and cost/effectiveness of GPUs have led to a widespread...
High-performance computing is increasingly being done on parallel machines like GPUs. In my work, I ...
Writing high-performance GPGPU code is often difficult and time-consuming, potentially requiring lab...
This thesis work is funded by the ANR PetaQCD project. We have mainly worked on two topics of GPU pe...
High-level tools for analyzing and predicting the performance of GPU-accelerated applications are scarc...