Using FPGA-based acceleration of high-performance computing (HPC) applications to reduce energy and power consumption is becoming an interesting option, thanks to the availability of high-level synthesis (HLS) tools that enable fast design cycles. However, obtaining good performance for memory-intensive algorithms, which often exchange large data arrays with external DRAM, still requires time-consuming optimization and good knowledge of hardware design. This article proposes a new design methodology, based on dedicated application- and data array-specific caches. These caches provide most of the benefits that can be achieved by coding optimized DMA-like transfer strategies by hand into the HPC application code, but require only limited manu...
This contribution presents the performance modeling of a super desktop with GPU and FPGA accelerator...
High performance computing (HPC) demands huge memory bandwidth and computing resources to achieve ma...
For decades, the computational performance of processors has grown at a faster rate than the availab...
Designs implemented on field-programmable gate arrays (FPGAs) via high-level synthesis (HLS) suffer...
Developing FPGA implementations with an input specification in a high-level programming lan...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
The demand for scalable, high-performance computing has increased as the size of datasets has grown ...
As we witness the breakdown of Dennard scaling, we can no longer get faster computers by shrinking t...
To bridge the ever-increasing performance gap between the processor and the main memory in a...
FPGA-based accelerators have recently evolved as strong competitors to the traditional GPU-based acc...
Designing FPGA-based accelerators is a difficult and time-consuming task which...
A hardware implementation can bring orders of magnitude improvements in performance and energy cons...
High-performance computing on heterogeneous platforms in general and those with FPGAs in particular ...
Numerical simulations can help solve complex problems. Most of these algorithms are massively parall...