FPGA-based accelerators have demonstrated high energy efficiency compared to GPUs and CPUs. However, single-FPGA designs may not achieve sufficient task parallelism. In this work, we optimize the mapping of high-performance multi-kernel applications, such as Convolutional Neural Networks, to multi-FPGA platforms. First, we formulate the system-level optimization problem: choosing, within a huge design space, the parallelism and the number of compute units for each kernel in the pipeline. Then we solve it by combining Geometric Programming, which produces the optimum-performance solution under resource and DRAM bandwidth constraints, with a heuristic allocator that places the compute units on the FPGA cluster.
Peer Reviewed
Postprint (author's final draft)
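The two-step flow described in the abstract (size the compute units per kernel, then place them on the cluster) can be illustrated with a small sketch. This is not the authors' method: a simple greedy bottleneck-relief loop stands in for the Geometric Programming step, first-fit decreasing stands in for the heuristic allocator, and DRAM bandwidth is ignored. The function names, the single scalar resource per compute unit, and all numbers are assumptions made for illustration only.

```python
def balance_pipeline(work, cost, budget):
    """Greedy stand-in for the sizing step (NOT Geometric Programming).

    work[k]  : assumed per-item latency of kernel k with one compute unit (CU)
    cost[k]  : assumed resource cost of one CU of kernel k (single scalar)
    budget   : total resources available across the whole cluster
    Returns the number of CUs chosen for each kernel.
    """
    counts = [1] * len(work)
    used = sum(cost)
    if used > budget:
        raise ValueError("budget too small for one CU per kernel")
    while True:
        # The stage with the largest per-item latency limits pipeline throughput.
        slowest = max(range(len(work)), key=lambda k: work[k] / counts[k])
        if used + cost[slowest] > budget:
            # The bottleneck cannot be relieved, so extra CUs elsewhere would not help.
            return counts
        counts[slowest] += 1
        used += cost[slowest]


def place_first_fit(counts, cost, num_fpgas, capacity):
    """First-fit decreasing placement of CUs onto identical FPGAs.

    Returns one list of kernel indices per FPGA, or None if some CU does not fit.
    """
    free = [capacity] * num_fpgas
    placement = [[] for _ in range(num_fpgas)]
    # Expand the counts into individual CUs and place the largest ones first.
    units = sorted(
        ((cost[k], k) for k, n in enumerate(counts) for _ in range(n)),
        reverse=True,
    )
    for unit_cost, kernel in units:
        for fpga in range(num_fpgas):
            if free[fpga] >= unit_cost:
                free[fpga] -= unit_cost
                placement[fpga].append(kernel)
                break
        else:
            return None  # no FPGA has room for this CU
    return placement


if __name__ == "__main__":
    work = [4, 8, 2]     # hypothetical per-item latencies of three kernels
    cost = [10, 20, 5]   # hypothetical resource cost per CU
    counts = balance_pipeline(work, cost, budget=100)
    print(counts)
    print(place_first_fit(counts, cost, num_fpgas=2, capacity=50))
```

With these made-up inputs the sizing step settles on 2, 3 and 1 compute units for the three kernels, and the placement step spreads them over two FPGAs. Note that sizing against a cluster-wide budget does not guarantee that a per-FPGA placement exists, which is why `place_first_fit` can return `None`; a real flow would have to iterate between the two steps or constrain the sizing step accordingly.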
The high demand for addressing the required processing power of today's big-data and compute-intensi...
ARTICo3 is an architecture that permits to dynamically set an arbitrary number of reconfigurable har...
Next generation FPGA circuits will allow the integration of dozens of hard and...
Multi-FPGA platforms, like Amazon AWS F1, can run in the cloud multi-kernel pipelined applications, ...
Platforms with multiple Field Programmable Gate Arrays (FPGAs), such as Amazon Web Services (AWS) F1...
Multi-FPGA platforms like Amazon Web Services F1 are perfect to accelerate multi-kernel pipelined ap...
Multi-FPGA platforms, like Amazon AWS F1, can run in the cloud multikernel pipelined applications, l...
The predictive power of Convolutional Neural Networks (CNNs) has been an integral factor for emergin...
Heterogeneous chips that combine CPUs and FPGAs can distribute processing so that the algorithm task...
This contribution presents the performance modeling of a super desktop with GPU and FPGA accelerator...
ParaFPGA 2011 marks the third mini-symposium devoted to the methodology, design and implementation o...
This dissertation investigates design target, modeling, and optimization for field-programmable gate...
Nowadays, a new parallel paradigm for energy-efficient heterogeneous hardware infrastructures is req...