This work introduces a generalized framework for automatically tuning stencil computations to achieve superior performance on a broad range of multicore architectures. Stencil (nearest-neighbor) based kernels constitute the core of many important scientific applications involving block-structured grids. Auto-tuning systems search over optimization strategies to find the combination of tunable parameters that maximizes computational efficiency for a given algorithmic kernel. Although the auto-tuning strategy has been successfully applied to libraries, generalized stencil kernels are not amenable to packaging as libraries. The kernels studied in this work include both memory-bound kernels and a computation-bound bilateral filtering kernel....
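The search described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the framework itself: the function names `jacobi_tiled` and `autotune`, the choice of a 5-point Jacobi sweep as the kernel, tile size as the only tunable parameter, and exhaustive timing as the search strategy. In pure Python the tile size has little effect on run time, so the sketch shows only the structure of the search, not realistic speedups.

```python
import itertools
import time


def jacobi_tiled(grid, tile_i, tile_j):
    """One 5-point Jacobi sweep over the interior of a 2D grid, visited tile by tile."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for ii in range(1, n - 1, tile_i):          # loop over tile origins
        for jj in range(1, m - 1, tile_j):
            for i in range(ii, min(ii + tile_i, n - 1)):      # loop inside one tile
                for j in range(jj, min(jj + tile_j, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out


def autotune(grid, tile_sizes, repeats=3):
    """Time every (tile_i, tile_j) pair and return the fastest one (hypothetical helper)."""
    best_cfg, best_time = None, float("inf")
    for cfg in itertools.product(tile_sizes, repeat=2):
        elapsed = float("inf")
        for _ in range(repeats):  # keep the best of several runs to reduce timing noise
            start = time.perf_counter()
            jacobi_tiled(grid, *cfg)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, best_time


# Small deterministic test grid; the size and the candidate tile sizes are arbitrary.
grid = [[float((i * 31 + j * 17) % 7) for j in range(64)] for i in range(64)]
best_cfg, best_time = autotune(grid, tile_sizes=[4, 16, 64])
```

Tiling changes only the order in which grid points are visited, so every configuration produces the same result; the tuner is free to pick whichever runs fastest on the machine at hand.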
On multi-core clusters or supercomputers, how to get good performance when running high performance ...
In high-performance computing, excellent node-level performance is required for the efficient use of...
Dissertation: Stencil computations are operations on structured grids. They are frequently found in pa...
Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural r...
In this paper, we present PATUS, a code generation and auto-tuning framework for stencil c...
Understanding the most efficient design and utilization of emerging multicore systems is one of the ...
This study focuses on the key numerical technique of stencil computations, used in many different sc...
The recent transformation from an environment where gains in computational performance came from inc...
The focus of this work is the automatic performance tuning of stencil computations on Graphics Proce...
We propose and evaluate a novel strategy for tuning the performance of a class of stencil computatio...
Stencil computations are a class of algorithms operating on multi-dimensional arrays, which update a...
Code transformations, such as loop tiling and loop fusion, are of key importance for the efficient i...
We present an auto-tuning approach to optimize application performance on emerging multicore archite...
Performance optimization of stencil computations has been widely studied in the literature, ...
Stencil-based kernels constitute the core of many scientific applications on block-structured grids....