Abstract—Local memories increase the efficiency of hardware accelerators by enabling fast accesses to frequently used data. In addition, the access latencies of local memories are deterministic, which allows for more accurate evaluation of system performance during design exploration. We have previously proposed local memories with an un-cached memory slave interface that permits programs running on the processor to access the locally stored variables in the hardware accelerator. While this has relaxed the memory constraints for porting code sections to hardware accelerators, there is now a need to consider the read/write access penalties of local memories from the processor during design exploration. In order to facilitate the selection ...
Moore's Law has helped Field Programmable Gate Arrays (FPGAs) scale continuously in speed, capacity ...
Abstract—We describe new multi-ported cache designs suitable for use in FPGA-based processor/parall...
The performance gap between CPUs and memory has widened significantly since the 1980s maki...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
Commodity accelerator technologies including reconfigurable devices provide an order of magnitude pe...
The world needs special-purpose accelerators to meet future constraints on computation and power con...
Many algorithms and applications in scientific computing exhibit irregular access patterns as consec...
Reconfigurable heterogeneous systems-on-chips (SoCs) integrating multiple accelerators are cost-effe...
Programmable Systems-on-Chips (SoCs) are expected to incorporate a larger number of application-spec...
The motivation of this research was to evaluate the main memory performance of a hybrid super comput...
FPGA System-on-Chips (SoCs) are heterogeneous platforms that combine general-purpose processors with...
In modern system-on-chip architectures, specialized accelerators are increasingly used to improve pe...
Memory system efficiency is crucial for any processor to achieve high performance, especially in the...
This paper explores an important behavior of memory access instructions, called access region locali...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...