Understanding the most efficient design and utilization of emerging multicore systems is one of the most challenging questions faced by the mainstream and scientific computing industries in several decades. Our work explores multicore stencil (nearest-neighbor) computations -- a class of algorithms at the heart of many structured grid codes, including PDE solvers. We develop a number of effective optimization strategies, and build an auto-tuning environment that searches over our optimizations and their parameters to minimize runtime, while maximizing performance portability. To evaluate the effectiveness of these strategies we explore the broadest set of multicore architectures in the current HPC literature, including the Intel Clovertown,...
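As a rough illustration of the two ideas named in the abstract, the sketch below pairs a nearest-neighbor (5-point) stencil sweep with a brute-force search over one tuning parameter. It is a minimal sketch under stated assumptions, not the authors' auto-tuner: the function names (`jacobi_step`, `make_grid`, `autotune`), the choice of cache blocking as the single optimization, and the candidate block sizes are all hypothetical and chosen only for illustration.

```python
import time

def jacobi_step(src, dst, block):
    """One 5-point stencil sweep over the interior of src, tiled in block x block chunks."""
    n, m = len(src), len(src[0])
    for ii in range(1, n - 1, block):
        for jj in range(1, m - 1, block):
            for i in range(ii, min(ii + block, n - 1)):
                row, up, down, out = src[i], src[i - 1], src[i + 1], dst[i]
                for j in range(jj, min(jj + block, m - 1)):
                    # Each interior point becomes the average of its four neighbors.
                    out[j] = 0.25 * (up[j] + down[j] + row[j - 1] + row[j + 1])

def make_grid(n, m):
    """Build an n x m grid with arbitrary but deterministic values (illustrative data)."""
    return [[float((i * m + j) % 7) for j in range(m)] for i in range(n)]

def autotune(n, m, candidates, repeats=3):
    """Time each candidate block size and return the fastest (exhaustive search)."""
    src = make_grid(n, m)
    dst = [row[:] for row in src]
    best_block, best_time = None, float("inf")
    for block in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            jacobi_step(src, dst, block)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_block, best_time = block, elapsed
    return best_block

if __name__ == "__main__":
    # Hypothetical problem size and candidate set; the winner depends on the machine.
    print("fastest block size:", autotune(64, 64, [4, 8, 16, 64]))
```

The block size changes only the traversal order, not the result, which is what lets a tuner pick among variants by timing alone. A realistic auto-tuner would search a much larger space of optimizations and parameters; pure-Python timings here only demonstrate the search loop, not actual cache behavior.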
Stencil computations are a class of algorithms operating on multi-dimensional arrays, which update a...
Stencil computations are operations on structured grids. They are frequently found in pa...
The focus of this work is the automatic performance tuning of stencil computations on Graphics Proce...
Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural r...
This study focuses on the key numerical technique of stencil computations, used in many different sc...
On multi-core clusters or supercomputers, how to get good performance when running high performance ...
We propose and evaluate a novel strategy for tuning the performance of a class of stencil computatio...
Stencil computation represents an important numerical kernel in scientific com...
The recent transformation from an environment where gains in computational performance came from inc...
We present an auto-tuning approach to optimize application performance on emerging multicore archite...
This work introduces a generalized framework for automatically tuning stencil computations to achiev...
Stencil based computation on structured grids is a kernel at the heart of a la...
Stencil-based kernels constitute the core of many scientific applications on block-structured grids....
In this paper, we present Patus, a code generation and auto-tuning framework for stencil computation...