Abstract—Power and programming challenges make heterogeneous multi-cores composed of cores and ASICs an attractive alternative to homogeneous multi-cores. Recently, multi-purpose loop-based generated accelerators have emerged as an especially attractive accelerator option. They have several assets: short design time (automatic generation), flexibility (multi-purpose) but low configuration and routing overhead (unlike FPGAs), computational performance (operations are directly mapped to hardware), and a focus on memory throughput by leveraging loop constructs. However, with multiple streams, the memory behavior of such accelerators can become at least as complex as that of superscalar processors, while they still need to retain the memory o...
Heterogeneous System-on-Chip (SoC) architectures combine general-purpose processors with many accele...
Multithreading is a well-known technique for general-purpose systems to deliver a substantial perfor...
In heterogeneous computer architectures, the serial part of an application is coupled with domain-sp...
Power and programming challenges make heterogeneous multi-cores composed of co...
The many-accelerator architecture, mostly composed of general-purpose cores and accelerator...
As memory accesses increasingly limit the overall performance of reconfigurable accelerators, it is ...
We address programming of accelerator-based heterogeneous multiprocessors in the context of computat...
In light of the failure of Dennard scaling and recent slowdown of Moore's Law, both industry and aca...
In this paper, we present a methodology for designing a pipeline of accelerators for an application....
The world needs special-purpose accelerators to meet future constraints on computation and power con...
In modern embedded systems, heterogeneous architectures are crucial in achieving desired performance...
The adoption of High-Level Synthesis (HLS) tools has significantly reduced accelerator design time. ...
In modern system-on-chip architectures, specialized accelerators are increasingly used to improve pe...
With power limitations imposing hard bounds on the amount of a chip that can be powered simultaneous...
This work studies programmability enhancing abstractions in the context of accelerators and heteroge...