The automatic generation of hardware implementations for a given algorithm is generally a difficult task, especially when data dependencies span multiple iterations, as in iterative stencil loops (ISLs). In this paper, we introduce an automatic design flow that extracts parallelism from an ISL algorithm and performs a design space exploration to identify its best FPGA hardware implementation in terms of both area and throughput. Experimental results show that the proposed methodology generates hardware designs whose performance is comparable to that of manually optimized solutions and orders of magnitude higher than that of implementations generated by commercial high-level synthesis tools.
This thesis deals with ways to describe hardware. It presents the methods used in the synthesis of t...
Increases in the capacities and features of FPGAs have opened a new perspective on their use as appli...
Intensive Signal Processing (ISP) applications handle large amounts of data and are characterized by...
A large number of algorithms for multidimensional signal processing and scientific computation come...
Real-world applications such as image processing, signal processing, and others often contain a sequ...
Stencil computations are array-based algorithms that apply a computation to all array elements in a ...
Current tools for High-Level Synthesis (HLS) excel at exploiting Instruction-Level Parallel...
The growing interest in FPGA-based solutions for accelerating compute-demanding algorithms is pushin...
In this work, we present a modular software subsystem that exposes a set of APIs for supporting the ...
This paper describes an automated approach to hardware design space exploration, through a collabora...
For decades, the computational performance of processors has grown at a faster rate than the availab...
26th International Conference on Field-Programmable Logic and Applications, FPL 2016, Switzerland, 2...