Communicated by Guest Editors

Our aim is to apply program transformations to stencil codes in order to yield the highest possible performance. We recognize memory bandwidth as a major limitation of stencil code performance. We conducted a study in which we applied optimizing transformations to two Jacobi smoother kernels: one 3D 1st-order 7-point stencil and one 3D 3rd-order 19-point stencil. To obtain high performance, the optimizations have to be customized for the execution platform at hand. We illustrate this with experiments on two consumer and two server architectures. We also verified the need for complex optimizations with the help of the Execution-Cache-Memory performance model. A code generator with knowledge about stencil codes and...
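To make the 7-point kernel mentioned in the abstract concrete, the following is a minimal sketch of one damped Jacobi sweep in C. It is not the authors' code: it assumes a constant-coefficient Poisson problem on a grid with unit-stride x, and the names `jacobi7_sweep`, `idx`, `omega`, and `h2` are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Linear index into an nx*ny*nz grid stored with x as the fastest dimension.
   (Hypothetical helper, not taken from the paper.) */
static size_t idx(size_t x, size_t y, size_t z, size_t nx, size_t ny) {
    return (z * ny + y) * nx + x;
}

/* One damped Jacobi sweep with a 3D 7-point stencil.
   Assumed model problem: -Laplace(u) = rhs, grid spacing h, h2 = h*h.
   Each interior point reads its centre value and its six face neighbours
   from `in` and writes one value to `out`; boundary points are left untouched. */
void jacobi7_sweep(const double *in, double *out, const double *rhs,
                   size_t nx, size_t ny, size_t nz,
                   double h2, double omega) {
    const size_t sy = nx;       /* stride between neighbours in y */
    const size_t sz = nx * ny;  /* stride between neighbours in z */
    for (size_t z = 1; z + 1 < nz; ++z) {
        for (size_t y = 1; y + 1 < ny; ++y) {
            for (size_t x = 1; x + 1 < nx; ++x) {
                size_t c = idx(x, y, z, nx, ny);
                double neighbours = in[c - 1]  + in[c + 1]
                                  + in[c - sy] + in[c + sy]
                                  + in[c - sz] + in[c + sz];
                double jacobi = (neighbours + h2 * rhs[c]) / 6.0;
                /* Damping blends the old centre value with the Jacobi update. */
                out[c] = (1.0 - omega) * in[c] + omega * jacobi;
            }
        }
    }
}
```

In use, a caller would typically swap the `in` and `out` pointers between sweeps. The loop body performs only a handful of floating-point operations per grid point while streaming three arrays, which is one way to see why memory bandwidth tends to limit such kernels.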
Stencil computations are an integral component of applications in a number of scientific computing d...
Stencil-based kernels constitute the core of many scientific applications on block-structured grids....
State of the art in performance reporting in the High Performance Computing field is omitting detail...
Our aim is to apply program transformations to stencil codes, in order to yield highest possible per...
Stencil computation represents an important numerical kernel in scientific com...
This paper describes a new technique for optimizing serial and parallel stencil- and stencil-like op...
A widely used class of codes are stencil codes. Their general structure is very simple: data points ...
Performance optimization of stencil computations has been widely studied in the literature, ...
Application codes reliably achieve performance far less than the advertised capabilities of existing...
It is crucial to optimize stencil computations since they are the core (and most computation...
Code transformations, such as loop tiling and loop fusion, are of key importance for the efficient i...
Performance optimization of stencil computations has been widely studied in the literature, since th...
Earth system modeling computations use stencils extensively while running many kernels. Optimal codi...