Understanding the most efficient design and utilization of emerging multicore systems is one of the most challenging questions faced by the mainstream and scientific computing industries in several decades. Our work explores multicore stencil (nearest-neighbor) computations, a class of algorithms at the heart of many structured grid codes, including PDE solvers. We develop a number of effective optimization strategies, and build an auto-tuning environment that searches over our optimizations and their parameters to minimize runtime, while maximizing performance portability. To evaluate the effectiveness of these strategies we explore the broadest set of multicore architectures in the current HPC literature, including th
Stencil-based kernels constitute the core of many scientific applications on block-structured grids....
Stencil computations are a class of algorithms operating on multi-dimensional arrays, which update a...
The focus of this work is the automatic performance tuning of stencil computations on Graphics Proce...
Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural r...
This study focuses on the key numerical technique of stencil computations, used in many different sc...
This work introduces a generalized framework for automatically tuning stencil computations to achiev...
On multi-core clusters or supercomputers, how to get good performance when running high performance ...
The recent transformation from an environment where gains in computational performance came from inc...
Stencil computation represents an important numerical kernel in scientific com...
We present an auto-tuning approach to optimize application performance on emerging multicore archite...
In this paper, we present Patus, a code generation and auto-tuning framework for stencil computation...
We propose and evaluate a novel strategy for tuning the performance of a class of stencil computatio...
Stencil based computation on structured grids is a kernel at the heart of a la...
Stencil computations are operations on structured grids. They are frequently found in pa...