A large number of algorithms for multidimensional signal processing and scientific computation come in the form of iterative stencil loops (ISLs), whose data dependencies span multiple iterations. Because of their complex inner structure, automatic hardware acceleration of such algorithms is traditionally considered a difficult task. In this paper, we introduce an automatic design flow that identifies, in a wide family of bidimensional data processing algorithms, sub-portions that exhibit a kind of parallelism close to that of ISLs; these are mapped onto a space of highly optimized ad-hoc architectures, which is efficiently explored to identify the best implementations with respect to both area and throughput. Experimental result...
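To make the term concrete, the following is a minimal sketch of an iterative stencil loop in plain Python. It is not taken from the paper: the 5-point averaging kernel, the fixed-border policy, and the names `stencil_step` and `iterative_stencil_loop` are assumptions chosen for illustration only. It shows why data dependencies span iterations: each pass reads the neighbourhood of every cell as produced by the previous pass.

```python
def stencil_step(grid):
    """Apply one pass of a 5-point averaging stencil (illustrative kernel).

    Border cells are copied unchanged; this boundary policy is an assumption.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so reads always see the previous pass
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (
                grid[i][j]
                + grid[i - 1][j]
                + grid[i + 1][j]
                + grid[i][j - 1]
                + grid[i][j + 1]
            ) / 5.0
    return out


def iterative_stencil_loop(grid, iterations):
    """Repeat the stencil pass; each iteration depends on the one before it."""
    for _ in range(iterations):
        grid = stencil_step(grid)
    return grid
```

After `k` iterations, an interior cell depends on input cells up to `k` positions away in each direction (where the grid extends that far), which is the cross-iteration dependency growth that makes such loops hard to parallelise automatically.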
Stencil computations represent a highly recurrent class of algorithms in various high performance co...
For decades, the computational performance of processors has grown at a faster rate than the availab...
Real-world applications such as image processing, signal processing, and others often contain a sequ...
The automatic generation of hardware implementations for a given algorithm is generally a difficult ...
Stencil computations are array-based algorithms that apply a computation to all array elements in a ...
In this paper we investigate how stencil computations can be implemented on state-of-the-art...
Traditionally, parallel implementations of multimedia algorithms are carried out manually, since the...
Hardware acceleration is the use of custom hardware architectures to perform some computations faste...
In this paper we propose a design template for stencil computations targeting ...
Iterative stencils represent the core computational kernel of many applications belonging to differe...
The recent increase in the complexity of circuits has brought high-level synth...
The Texas Instruments C66x Digital Signal Processor (DSP) is an embedded processor technology that i...
We advocate the Loop-of-stencil-reduce pattern as a means of simplifying the implementation of data-...