This thesis presents and evaluates a bus-based system for FCUDA, a translation tool enabling CUDA code to be run on FPGAs. With the goal of constructing a solid light-weight back-end with optimized performance, we choose AXI4 as the communication protocol and plug in all necessary components on a hierarchical bus system. Then, FCUDA cores are added in the back-end and the comprehensive system is automated into a single tool chain. Several optimizations are added in this automated FCUDA bus system for the delivery of better performance. For example, FCUDA cores are tiled into clusters based on configuration inputs, and clock domains are separated to reduce long wires. For the experiments, this work adjusts the existing resources and period m...
Reconfigurable computing devices can increase the performance of compute intensive algorithms by imp...
With the advances in very large scale integration (VLSI) technology, hardware is going parallel. Sof...
This contribution presents the performance modeling of a super desktop with GPU and FPGA accelerator...
Recent progress in high-level synthesis (HLS) has helped raise the abstraction level of hardware des...
The demand for high-performance computing has been growing significantly in the past decade. The bot...
High-level synthesis (HLS) of data-parallel input languages, such as the Compute Unified Device Arch...
In this report, I will show that the current CUDA-to-FPGA (FCUDA) flow has been tested with a good s...
This dissertation focuses on efficient generation of custom processors from high-level language desc...
We can exploit the standardization of communication abstractions provided by modern high-level synth...
In the design of a multi-processor System-on-a-Chip (SoC), the bus architecture typically comes to t...
After more than 30 years, reconfigurable computing has grown from a concept to a mature field of scien...
We can exploit the standardization of communication abstractions provided by modern high-l...
In this paper we present a “high-level” FPGA architecture description language which lets FPGA arch...
Modern FPGAs that benefit from advancement in process technology and hard IP cores are increasingly ...
The exploding complexity and computation efficiency requirements of applications are stimulating a s...