The adoption of High-Level Synthesis (HLS) tools has significantly reduced accelerator design time. A complex scaling problem that remains is the data transfer bottleneck. To scale up performance, accelerators require huge amounts of data and are often limited by interconnect resources. In addition, the energy spent by the accelerator is often dominated by the transfer of data, either in the form of memory references or data movement on the interconnect. In this paper we drastically reduce accelerator communication by exploring computation reordering and local buffer usage. To this end, we present a new analytical methodology to optimize nested loops for inter-tile data reuse with loop transformations such as interchange and tiling. We focus...
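To make the idea of inter-tile data reuse concrete, the following is a minimal sketch, not the analytical methodology of the paper. It brute-force counts tile transfers for a tiled matrix multiplication under assumptions of our own: one local buffer per array, each holding exactly one tile, and a tile counted as transferred whenever it differs from the tile left in the buffer by the previous tile iteration. The function name `count_tile_loads` and the tile counts are hypothetical.

```python
from itertools import product


def count_tile_loads(extents, order):
    """Count tile transfers for C[i][j] += A[i][k] * B[k][j] over tile loops.

    extents: dict mapping a tile-loop name ('i', 'j', 'k') to its tile count.
    order:   tile-loop nesting from outermost to innermost, e.g. ('i', 'j', 'k').
    Assumption: one single-tile buffer per array; write-back of C is not
    modelled separately, a C tile switch is simply counted as one transfer.
    """
    # Tile loops that index each array's tile.
    footprint = {"A": ("i", "k"), "B": ("k", "j"), "C": ("i", "j")}
    resident = {name: None for name in footprint}  # tile currently buffered
    loads = {name: 0 for name in footprint}

    # Walk the tile iteration space in the requested loop order.
    for point in product(*(range(extents[v]) for v in order)):
        idx = dict(zip(order, point))
        for name, dims in footprint.items():
            tile = tuple(idx[d] for d in dims)
            if resident[name] != tile:  # not reusable from the previous tile
                loads[name] += 1
                resident[name] = tile
    return loads


if __name__ == "__main__":
    tiles = {"i": 2, "j": 8, "k": 4}  # hypothetical tile counts
    for order in (("i", "j", "k"), ("i", "k", "j")):
        loads = count_tile_loads(tiles, order)
        print(order, loads, "total:", sum(loads.values()))
```

Under these assumptions, interchanging the two inner tile loops moves the reuse from the C tiles to the A tiles and changes the total number of transfers. An analytical model would aim to predict such counts without enumerating the iteration space.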
Heterogeneous systems have emerged as state-of-the-art computing solutions. Such systems consist of ...
PPoPP'12 extended version. International audience. Some data- and compute-intensive applications can be ...
There is a large, emerging, and commercially relevant class of applications which stands to be enabl...
The adoption of High-Level Synthesis (HLS) tools has significantly reduced accelerator design time. ...
High Level Synthesis tools have reduced accelerator design time. However, a complex scaling problem ...
High Level Synthesis tools have reduced accelerator design time. However, a complex scaling problem...
The demand for high performance has driven acyclic computation accelerators into extensive use in mo...
In light of the failure of Dennard scaling and recent slowdown of Moore's Law, both industry and aca...
Recent research in embedded computing indicates that packing multiple processor cores on the same d...
In modern system-on-chip architectures, specialized accelerators are increasingly used to improve pe...
High-level synthesis (HLS) is well capable of generating control and computation circuits for FPGA a...
Some data- and compute-intensive applications can be accelerated by offloading portions of codes to...
This dissertation investigates the communication optimization for customizable domain-specific compu...
In modern embedded systems, heterogeneous architectures are crucial in achieving desired performance...
Heterogeneous multicore systems are becoming increasingly important as the need for computation powe...