Loops are an important source of optimization. In this paper, we propose a new technique for optimizing loops that contain kernels mapped on a reconfigurable fabric. We assume the Molen machine organization and programming paradigm as our framework. The method we propose extends our previous work on loop unrolling for reconfigurable architectures by combining unrolling with shifting to relocate the function calls contained in the loop body such that in every iteration of the transformed loop, software functions (running on GPP) execute in parallel with multiple instances of the kernel (running on FPGA). The algorithm is based on profiling information about the kernel’s execution times on GPP and FPGA, memory transfers and area utilizatio...
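To make the combination of unrolling and shifting concrete, the following is a minimal sketch of the resulting execution order, not the algorithm from the paper. It assumes a loop body made of one software function followed by one kernel call, where the kernel of iteration i depends only on the software function of the same iteration. The function name `shifted_schedule` and the parameters `n` (iteration count) and `u` (unroll factor) are hypothetical names introduced for illustration; the profiling-based choice of the unroll factor is not modeled.

```python
def shifted_schedule(n, u):
    """Return the execution steps of a loop unrolled by u and shifted by one block.

    Each step is a pair (sw_iters, hw_iters): the software functions of the
    iterations in sw_iters are assumed to run on the GPP while the kernel
    instances of the iterations in hw_iters run in parallel on the FPGA.
    """
    if n <= 0 or u <= 0:
        raise ValueError("n and u must be positive")

    # Unrolling: group the iteration space into blocks of at most u iterations.
    blocks = [list(range(start, min(start + u, n))) for start in range(0, n, u)]

    # Prologue: software functions of the first block, no kernel is ready yet.
    steps = [(blocks[0], [])]

    # Steady state: kernels of the current block overlap with the software
    # functions of the next block (this is the effect of shifting).
    for current, following in zip(blocks, blocks[1:]):
        steps.append((following, current))

    # Epilogue: kernels of the last block, no software work remains.
    steps.append(([], blocks[-1]))
    return steps


if __name__ == "__main__":
    for sw_iters, hw_iters in shifted_schedule(5, 2):
        print("GPP:", sw_iters, "| FPGA:", hw_iters)
```

In this sketch, every kernel instance is scheduled one step after the software function it depends on, so the assumed dependence is preserved while the GPP and the FPGA are both busy in the steady-state steps.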
The memory bandwidth largely determines the performance of embedded systems. However, very often com...
Dynamic hardware generation reduces the number of FPGA resources needed and speeds up an application...
With the increasing demand for flexible yet highly efficient architecture platforms for media applic...
Reconfigurable circuits and systems have evolved from application specific accelerators to a gener...
This article studies an important open problem in backend compilation regardin...
Nested loops represent a significant portion of application runtime in multimedia and DSP applicatio...
It is well-known that, to optimize a program for speed-up, efforts should be focused on the regions ...
Reconfigurable computing is a method of development that provides a developer with the ability to re...
In our study, we present the results of the implementation of the SHA-512 algorithm in FPGAs. The di...
Pipelining algorithms are typically concerned with improving only the steady-state performance, or t...
Reconfigurable circuits and systems have evolved from application specific accelerators to a general...
Software pipelining is a powerful technique to expose fine-grain parallelism, ...
This paper solves an open problem regarding loop unrolling after periodic regi...
This paper improves our previous research effort [1] by providing an efficient...
For loop accelerators such as coarse-grained reconfigurable architectures (CGRAs) and GP-GPUs, neste...