This paper focuses on challenging applications that can be expressed as an iterative pipeline of multiple 3D stencil stages and explores their optimization space on GPUs. For this study, we selected a representative example from the field of digital signal processing, the Anisotropic Nonlinear Diffusion algorithm. An open issue for these applications is determining the optimal fission/fusion level of the involved stages and whether that combination benefits from data tiling. This requires exploring the large space of all possible fission/fusion combinations, with and without tiling, which makes the process non-trivial. This study provides insights to reduce the optimization tuning space and programming effort of iterative multiple 3D stenci...
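To make the fission/fusion trade-off concrete, the following is a minimal CPU sketch in Python, not the paper's GPU implementation. It assumes a hypothetical two-stage 1D pipeline (the stage functions `stage_a` and `stage_b`, and the drivers `pipeline_fission` and `pipeline_fused`, are illustrative names that do not come from the source, and the stages are not those of the Anisotropic Nonlinear Diffusion algorithm). The split version runs each stage as its own pass over an intermediate array; the fused version produces the same result in a single pass by recomputing the first stage on the fly.

```python
def stage_a(u, i):
    """Hypothetical first stage: 3-point average of the input around index i."""
    return (u[i - 1] + u[i] + u[i + 1]) / 3.0


def stage_b(left, mid, right):
    """Hypothetical second stage: 3-point discrete Laplacian of stage-A values."""
    return right - 2.0 * mid + left


def pipeline_fission(u):
    """Fission: one pass per stage, with a full intermediate array in between."""
    n = len(u)
    tmp = [0.0] * n
    for i in range(1, n - 1):          # pass 1: stage A writes tmp
        tmp[i] = stage_a(u, i)
    out = [0.0] * n
    for i in range(2, n - 2):          # pass 2: stage B reads tmp
        out[i] = stage_b(tmp[i - 1], tmp[i], tmp[i + 1])
    return out


def pipeline_fused(u):
    """Fusion: a single pass; stage A is recomputed for each neighbour needed."""
    n = len(u)
    out = [0.0] * n
    for i in range(2, n - 2):
        # No intermediate array, but stage_a is evaluated three times per output.
        out[i] = stage_b(stage_a(u, i - 1), stage_a(u, i), stage_a(u, i + 1))
    return out


if __name__ == "__main__":
    signal = [float((i * i) % 7) for i in range(12)]
    print(pipeline_fission(signal) == pipeline_fused(signal))
```

In this sketch the fused variant avoids storing and re-reading the intermediate array at the cost of redundant stage-A evaluations, while the split variant does the opposite. Tiling is one way neighbouring outputs could share those intermediate values, which is why the best combination has to be explored rather than assumed.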
Over the past decade, computing architectures have continued to exploit multiple levels of paralleli...
The growth of data to be processed in the Oil & Gas industry matches the requirements imposed by evo...
Time-tiling is necessary for the efficient execution of iterative stencil computations. Classical ...
The most commonly used approach for solving reaction–diffusion systems relies upon stencil computati...
Paper presented at the Congreso Español de Informática 2013. Performance Analysis of the Multi-pass Tr...
We propose and evaluate a novel strategy for tuning the performance of a class of stencil computatio...
Abstract. During the last decade, nonlinear anisotropic diffusion models have been shown to be powerful me...
Abstract. In this paper we investigate how stencil computations can be implemented on state-of-the-art...
The focus of this work is the automatic performance tuning of stencil computations on Graphics Proce...
We present an efficient implementation of volumetric anisotropic image diffusion on modern programma...
This report explores using GPUs as a platform for performing high performance medical image data pro...
The quality of an image is highly critical for applications such as robotic vision, surveillance, me...
Stencil computations are a class of algorithms operating on multi-dimensional arrays, which update a...