Stencil computations are a key class of applications, widely used in the scientific computing community, and one that has particularly benefited from performance improvements on architectures with high memory bandwidth. Unfortunately, such architectures come with a limited amount of fast memory, which limits the size of the problems that can be solved efficiently. In this paper, we address this challenge by applying the well-known cache-blocking tiling technique to large-scale stencil codes implemented using the OPS domain-specific language, such as CloverLeaf 2D, CloverLeaf 3D, and OpenSBLI. We introduce a number of techniques and optimisations to help manage data resident in fast memory and minimise data movement. Evaluating our...
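As a rough illustration of the cache-blocking idea named in the abstract above, the following minimal sketch tiles a single 5-point Jacobi sweep over a 2D grid so that each tile's working set can stay in fast memory. It is an assumption-laden toy, not the OPS implementation: the function names, the grid contents, and the tile sizes are all hypothetical, and it blocks only one sweep rather than managing data across many loops.

```python
def make_grid(ny, nx):
    """Hypothetical test grid with deterministic, non-uniform values."""
    return [[float((3 * j + 5 * i) % 7) for i in range(nx)] for j in range(ny)]


def jacobi_sweep(src, dst):
    """Untiled reference: one 5-point Jacobi sweep over the interior points."""
    ny, nx = len(src), len(src[0])
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dst[j][i] = 0.25 * (src[j - 1][i] + src[j + 1][i]
                                + src[j][i - 1] + src[j][i + 1])


def jacobi_sweep_tiled(src, dst, tile_y=4, tile_x=4):
    """Same sweep, but the interior is visited tile by tile.

    tile_y and tile_x are illustrative; in practice they would be chosen
    so that a tile plus its halo fits in the fast memory level targeted.
    """
    ny, nx = len(src), len(src[0])
    for jj in range(1, ny - 1, tile_y):          # loop over tile rows
        for ii in range(1, nx - 1, tile_x):      # loop over tile columns
            # Clamp the tile to the interior so edge tiles may be smaller.
            for j in range(jj, min(jj + tile_y, ny - 1)):
                for i in range(ii, min(ii + tile_x, nx - 1)):
                    dst[j][i] = 0.25 * (src[j - 1][i] + src[j + 1][i]
                                        + src[j][i - 1] + src[j][i + 1])
```

Because the sweep reads only from `src` and writes only to `dst`, the tiles are independent and the tiled version produces the same values as the untiled one; only the traversal order, and hence the memory access pattern, changes.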
Application codes reliably achieve performance far less than the advertised capabilities of existing...
The implementation of stencil computations on modern, massively parall...
Stencil computations arise in many scientific computing domains, and often represent time-critical ...
The key common bottleneck in most stencil codes is data movement, and prior research has shown that ...
In this paper we investigate how stencil computations can be implemented on state-of-the-art...
Stencil computations form the basis for computer simulations across almost every field of science, s...
The implementation of stencil computations on modern, massively parallel systems with GPUs and other...
Stencil computation is of paramount importance in many fields, in image processing, structur...
Stencil computations are commonly used in a wide variety of scientific applications, ranging from la...
Stencil computation is one of the most used kernels in a wide variety of scientific applications, ra...
This paper describes a new technique for optimizing serial and parallel stencil- and stencil-like op...
In this paper we propose a design template for stencil computations targeting ...
We are witnessing a fundamental paradigm shift in computer design. Memory has been and is becoming m...
It is crucial to optimize stencil computations since they are the core (and most computation...