General-purpose GPU-based systems are highly attractive, as they give potentially massive performance at little cost. Realizing such potential is challenging due to the complexity of programming. This article presents a compiler-based approach to automatically generate optimized OpenCL code from data parallel OpenMP programs for GPUs. A key feature of our scheme is that it leverages existing transformations, especially data transformations, to improve performance on GPU architectures and uses automatic machine learning to build a predictive model to determine if it is worthwhile running the OpenCL code on the GPU or OpenMP code on the multicore host. We applied our approach to the entire NAS parallel benchmark suite and evaluated it on dist...
As an open, royalty-free framework for writing programs that execute across heterogeneous platforms,...
GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Al...
A major shift in technology from maximizing single-core performance to integrating multiple cores ha...
General purpose GPU based systems are highly attractive as they give potentially massive performance...
Heterogeneous computing systems with multiple CPUs and GPUs are increasingly popular. Today, heterog...
Graphics Processing Units (GPUs) have been widely adopted to accelerate the execution of HPC workload...
In this work, we evaluate OpenCL as a programming tool for developing performance-portable applicati...
OpenCL has been designed to achieve functional portability across multi-core devices from different ...
Graphics Processing Units (GPUs) have been successfully used to accelerate scientific applications d...
Recently, OpenCL, a new open programming standard for GPGPU programming, has become availa...
General-purpose GPUs provide massive compute power, but are notoriously difficult to program. In thi...
Despite the fact that the GPU was originally intended as a co-processor specializing in graphics r...
Heterogeneous multicore architectures with CPU and add-on GPUs or streaming processors are now widel...
Heterogeneous computer systems are ubiquitous in all areas of computing, from mobile to high-perfor...