Offloading compute-intensive kernels to hardware accelerators relies on the large degree of parallelism offered by these platforms. However, the effective bandwidth of the memory interface often causes a bottleneck, hindering the accelerator’s effective performance. Techniques enabling data reuse, such as tiling, lower the pressure on memory traffic but still often leave the accelerators I/O-bound. A further increase in effective bandwidth is possible by using burst rather than element-wise accesses, provided the data is contiguous in memory. In this paper, we propose a memory allocation technique, and provide a proof-of-concept source-to-source compiler pass, that enables such burst transfers by modifying the data layout in exte...
It is very challenging to design an on-chip memory architecture for high-performance kernels with la...
In the near future, cameras will be used everywhere as flexible sensors for numerous applications. F...
High Level Synthesis tools have reduced accelerator design time. However, a complex scaling problem ...
As memory accesses increasingly limit the overall performance of reconfigurable accelerators, it is ...
The adoption of High-Level Synthesis (HLS) tools has significantly reduced accelerator design time. ...
Commodity accelerator technologies including reconfigurable devices provide an order of magnitude pe...
This dissertation presents a hardware accelerator that is able to accelerate large (including non-pa...
High Level Synthesis tools have reduced accelerator design time. However, a complex scaling problem...
Hardware accelerators such as GPUs and FPGAs can often provide enormous computing capabilities and p...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
FPGA designs have an immense design space, and there can be an order of magnitude performance differ...
Burst-Buffers are high-throughput, small-size storage which are being used ...
The performance gap between CPUs and memory has widened significantly since the 1980s, maki...
The effective bandwidth of the FPGA external memory, usually DRAM, is extremely sensitive to the acc...
A key enabler for the ever-increasing adoption of FPGA accelerators is the availability of framework...