Modern high-performance computing architectures (multicore, GPU, manycore) are based on tightly coupled clusters of processing elements, physically implemented as rectangular tiles. The size and aspect ratio of these tiles strongly affect the achievable operating frequency and energy efficiency, but they should be as flexible as possible to achieve high utilization of the top-level die floorplan. In this paper, we explore the flexibility range for a high-performance cluster of RISC-V cores with shared L1 memory used to build scalable accelerators, with the goal of establishing a hierarchical implementation methodology where clusters can be modeled as soft tiles to achieve optimal die utilization.
Comment: 6 pages. Accepted for publication in the IEEE ...
With the introduction of more powerful and massively parallel embedded processors, embedded systems ...
The adoption of High-Level Synthesis (HLS) tools has significantly reduced accelerator design time. ...
This paper proposes a well-suited strategy for High Performance Computing (HPC) of density-based top...
FPGA overlays have shown the potential to improve designers’ productivity through balancing flexibil...
With processor clock speeds having stagnated, parallel computing architectures have achieved a break...
Common practice for large FPGA design projects is to divide sub-projects into separate synt...
In this paper, we investigate the power implications of tile size selection for tile-based processor...
While parallel architectures based on clusters of Processing Elements (PEs) sharing L1 memory are wi...
The ever-increasing demand for high clock speeds and the desire to exploit abundant ...
During the past 10 years, the clock frequency of high-end superscalar processo...
A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems Rio Yokota...
3D die-stacked chips are emerging as intriguing prospects for the future because of ...
How to effectively use the increasing number of transistors available on a single chip while avoidin...
Embedded computing platforms must support complex functionalities with high computational thro...
We propose a soft processor programming model and architecture inspired by graphics processing units (...