The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually tuned code written with low-level approaches such as OpenCL and CUDA. This makes the development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach, which combines high-level programming abstractions with competitive performance on multi-GPU systems. SkelCL extends the OpenCL standard with three high-level features: 1) pre-implemented parallel patterns (a.k.a. skeletons); 2) container data types for vectors and matrices; 3) an automatic data (re)distribution mechanism. We introduce two new SkelCL skeleto...
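To make the idea of a stencil expressed as a skeleton more concrete, the following is a minimal sequential sketch in Python. It is not SkelCL's API and does not target GPUs; the names `stencil_skeleton` and `five_point_sum`, the relative-offset accessor `get`, and the constant-padding boundary policy are all assumptions chosen for illustration. The point it shows is the separation of concerns: the skeleton owns the iteration and boundary handling, while the user supplies only the per-element function.

```python
from typing import Callable, List

Matrix = List[List[float]]
# Hypothetical accessor type: get(dr, dc) returns the neighbour at a relative offset.
Neighbourhood = Callable[[int, int], float]


def stencil_skeleton(grid: Matrix, radius: int,
                     f: Callable[[Neighbourhood], float],
                     pad: float = 0.0) -> Matrix:
    """Apply f to the neighbourhood of every element (sequential reference sketch).

    Out-of-bounds neighbours read as `pad` (an assumed boundary policy).
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0

    def at(r: int, c: int) -> float:
        # Boundary handling lives in the skeleton, not in the user function.
        return grid[r][c] if 0 <= r < rows and 0 <= c < cols else pad

    out: Matrix = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Bind r and c as defaults so each accessor refers to its own element.
            def get(dr: int, dc: int, r: int = r, c: int = c) -> float:
                if abs(dr) > radius or abs(dc) > radius:
                    raise IndexError("offset outside stencil radius")
                return at(r + dr, c + dc)
            row.append(f(get))
        out.append(row)
    return out


def five_point_sum(get: Neighbourhood) -> float:
    """User function: sum of an element and its four direct neighbours."""
    return get(0, 0) + get(-1, 0) + get(1, 0) + get(0, -1) + get(0, 1)


grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]
result = stencil_skeleton(grid, 1, five_point_sum)
```

In a parallel implementation, each output element could be computed independently, which is what makes the pattern a good fit for GPUs; distributing the grid across several devices additionally requires exchanging the border regions that neighbouring partitions read.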
Application development for modern high-performance systems with Graphics Processing Units (...
Algorithmic skeletons simplify software development: they abstract typical patterns of parallelism a...
In this paper we advocate the Loop-of-stencil-reduce pattern as a way to simplify the parallel progr...
The implementation of stencil computations on modern, massively parallel systems with GPUs and other...
The implementation of stencil computations on modern, massively parall...
The implementation of stencil computations on modern, massively parallel systems with GPUs and othe...
Application development for modern high-performance systems with Graphics Processing Units (GPUs) cu...
While CUDA and OpenCL made general-purpose programming for Graphics Processing Units (GPUs) pop...
Application development for modern high-performance systems with Graphics Processing Units (GPUs) cu...
Application programming for GPUs (Graphics Processing Units) is complex and error-prone...
Application programming for GPUs (Graphics Processing Units) is complex and error-prone, becaus...
Application development for modern high-performance systems with many cores, i.e., comprising multip...
Application development for modern high-performance systems with Graphics Processing Units (GPUs) re...
© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for a...
Application development for modern high-performance systems with Graphics Processing Units (GPUs) re...