Abstract-In this paper, we present an overview of interconnect solutions for hardware accelerator systems. A number of solutions are presented: bus-based, DMA, crossbar, NoC, as well as combinations of these. The paper proposes analytical models to predict the performance of these solutions and implements the solutions in practice. A JPEG decoder application is implemented as our case study in different scenarios using the presented interconnect solutions. We profile the application to extract the input data for our analytical models. Measurement results show that the NoC solution combined with a bus-based system provides the best performance, as predicted by the analytical models. The NoC solution achieves a speed-up of up to 2.4× compared to the b...
Modern heterogeneous multiprocessors integrate CPU and GPU together to provide a boost to computatio...
This paper focuses on mastering the architecture development of reconfigurable hardware accelerators...
In heterogeneous computer architectures, the serial part of an application is coupled with domain-sp...
Heterogeneous multicore systems are becoming increasingly important as the need for computation powe...
Hardware accelerators are used to speed up execution of specific tasks such as video coding. Often t...
This paper focuses on mastering the architecture development of hardware accelerators. It presents t...
In light of the failure of Dennard scaling and recent slowdown of Moore's Law, both industry and aca...
High performance computing platforms are moving from homogeneous individual units to heterogeneous sy...
In this work, a hybrid CPU/accelerator platform, which runs a standard operating system, is proto-ty...
Connection-oriented Guaranteed-Throughput (GT) mesh-based Networks on Chip (NoCs) have been proposed...
Following trends that emphasize neural networks for machine learning, many studies regarding computi...
Modern computer vision and image processing embedded systems exploit hardware acceleration inside s...
SDR applications are often stream processing applications that are computationally intensive which r...
Efficient data movement in multi-node systems is a crucial issue at the crossroads of scientific com...
The high performance computing landscape is shifting from collections of homogeneous nodes towards h...