The key common bottleneck in most stencil codes is data movement, and prior research has shown that optimisations which improve data locality across loops do particularly well. However, in many large PDE applications it is not possible to apply such optimisations through compilers because there are many options, execution paths and data per grid point, many dependent on run-time parameters, and the code is distributed across different compilation units. In this paper, we adapt the locality-improving optimisation called tiling for use in large OPS applications, on both shared-memory and distributed-memory systems, relying on run-time analysis and delayed execution. We evaluate our approach on a number of applications, o...
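To make the idea of tiling across loops with delayed execution concrete, the following is a minimal 1D sketch, not the OPS implementation. The names `LoopChain`, `par_loop`, `execute` and `make_smooth` are hypothetical and chosen for illustration only. The sketch assumes that each loop reads only the array written by the loop queued immediately before it, that every array is written by exactly one loop, and that tiles run sequentially; a real run-time analysis would also have to handle write-after-read dependencies, multiple dimensions and distributed memory.

```python
def make_smooth(src, dst):
    """Return a kernel writing a 3-point average of src into dst over [start, stop)."""
    def kernel(start, stop):
        for i in range(start, stop):
            dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return kernel


class LoopChain:
    """Hypothetical loop queue: loops are recorded first and executed later, tile by tile."""

    def __init__(self):
        self.loops = []  # entries: (kernel, lo, hi, read_radius)

    def par_loop(self, kernel, lo, hi, read_radius):
        # Delayed execution: only record the loop; nothing is computed yet.
        self.loops.append((kernel, lo, hi, read_radius))

    def execute(self, tile_size):
        if not self.loops:
            return
        n = len(self.loops)
        # Run-time dependency analysis (assumed chain of loops): loop k must run
        # ahead of loop k+1 by the stencil radius with which loop k+1 reads.
        shift = [0] * n
        for k in range(n - 2, -1, -1):
            shift[k] = shift[k + 1] + self.loops[k + 1][3]
        done = [lo for (_, lo, _, _) in self.loops]  # next index each loop must compute
        end = min(done)
        while any(done[k] < self.loops[k][2] for k in range(n)):
            end += tile_size  # advance to the next (skewed) tile
            for k, (kernel, lo, hi, _) in enumerate(self.loops):
                stop = min(hi, end + shift[k])
                if stop > done[k]:
                    kernel(done[k], stop)  # run this loop's slice of the tile
                    done[k] = stop
        self.loops.clear()


def run(n, tile_size):
    """Queue two dependent smoothing loops and execute them with the given tile size."""
    a = [float(i * i % 7) for i in range(n)]
    b = [0.0] * n
    c = [0.0] * n
    chain = LoopChain()
    chain.par_loop(make_smooth(a, b), 1, n - 1, 1)
    chain.par_loop(make_smooth(b, c), 1, n - 1, 1)
    chain.execute(tile_size)
    return c


if __name__ == "__main__":
    # A tile larger than the domain degenerates to ordinary loop-by-loop execution.
    print(run(64, 8) == run(64, 10**6))
```

In this sketch the earlier loop is skewed forward by the read radius of the later loop, so every value a tile consumes has already been produced, and with a small `tile_size` the intermediate array `b` is reused while it is likely still in cache. Because the loops are only recorded by `par_loop`, the schedule can be chosen at run time without the compiler seeing both loops in one compilation unit.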
Stencil computations are iterative kernels often used to simulate the change in a discretized spatia...
Spatial computing devices have been shown to significantly accelerate stencil computations, but have...
In the field of scientific computation, loop tiling is an indispensable technique for improving cach...
Stencil computations are a key class of applications, widely used in the scientific computing commun...
Increasingly, the main bottleneck limiting performance on emerging multi-core and many-core...
A widely used class of codes are stencil codes. Their general structure is very simple: data points ...
The Polyhedral model has proven to be a valuable tool for improving memory locality and exploiting p...
Application codes reliably achieve performance far less than the advertised capabilities of existing...
This paper proposes tiling techniques based on data dependencies and not in code structur...
In this work, we present Dido, an implicitly parallel domain-specific language (DSL) that captures h...
Stencil computations are a widely used type of algorithm, found in applications from physical simula...
Performance optimization of stencil computations has been widely studied in the literature, ...
In this paper we explore the idea of customizing and reusing loop schedules to improve the scalabili...