Stencil operations represent a fundamental class of algorithms in high-performance computing. We are interested in what level of performance can be expected from a high-productivity language such as Chapel. To this end, we discuss four different implementations of a generic stencil operation with a convergence check after each iteration. We start with a sequential implementation, followed by a global-view implementation that we evaluate on a 16-core multi-core system and, using domain maps, on a cluster with up to 16 such nodes. We finish with a local-view implementation that explicitly encodes all design decisions with respect to parallel execution. This paper is set up as a two-stage experience report: we mainly report our...
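As a rough illustration of the kind of kernel described above, a sequential stencil sweep with a convergence check after each iteration might look like the following Python sketch. It is not the paper's Chapel code: the 4-point averaging stencil, the fixed boundary, the tolerance `epsilon`, and the names `make_grid`, `jacobi_step`, and `solve` are all assumptions chosen for illustration.

```python
def make_grid(n, top=1.0):
    """Build an n x n grid of zeros whose top boundary row is held at `top` (hypothetical setup)."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid


def jacobi_step(grid):
    """Apply one 4-point averaging sweep; return the new grid and the largest change."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # boundary cells are copied unchanged
    delta = 0.0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
            )
            delta = max(delta, abs(new[i][j] - grid[i][j]))
    return new, delta


def solve(grid, epsilon=1e-6, max_iters=10_000):
    """Sweep until the largest per-cell change drops below epsilon (assumed criterion)."""
    for iteration in range(1, max_iters + 1):
        grid, delta = jacobi_step(grid)
        if delta < epsilon:  # convergence check after each iteration
            return grid, iteration
    return grid, max_iters


if __name__ == "__main__":
    result, iterations = solve(make_grid(8))
    print(f"stopped after {iterations} iterations")
```

The convergence check needs a reduction over the whole grid after every sweep, which is the step that typically requires global communication once the grid is distributed across nodes.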
The implementation of stencil computations on modern, massively parallel systems with GPUs and other...
We propose and evaluate a novel strategy for tuning the performance of a class of stencil computatio...
The key common bottleneck in most stencil codes is data movement, and prior research has shown that ...
The increase in complexity, diversity and scale of high performance computing ...
Spatial computing devices have been shown to significantly accelerate stencil computations, but have...
Communicated by Guest Editors. The implementation of stencil computations on modern, massively parall...
Performance optimization of stencil computations has been widely studied in the literature, ...
It is crucial to optimize stencil computations since they are the core (and most computation...
The Chapel programming language provides constructs for expressing a wide range of parallelism patte...
This paper proposes tiling techniques based on data dependencies and not in code structur...
State of the art in performance reporting in the High Performance Computing field is omitting detail...
Performance optimization of stencil computations has been widely studied in the literature, since th...
The implementation of stencil computations on modern, massively parallel systems with GPUs and othe...
Stencil computation represents an important numerical kernel in scientific com...
Computing nodes in reconfigurable clusters are occupied and released by applications during...