Field-programmable gate arrays (FPGAs) often achieve order-of-magnitude speedups over microprocessors, but have typically been unable to improve the performance of applications with irregular memory access patterns, such as traversals of pointer-based data structures. Because these data structures are so common, the applicability and widespread success of FPGAs have been limited. In this paper, we introduce the traversal cache framework – a first step towards improving the performance of FPGA applications that utilize pointer-based data structures. The traversal cache is a local FPGA memory that stores repeated traversals of pointer-based data structures, allowing these traversals to be efficiently streamed into the FPGA. Altho...
Abstract—Developing FPGA implementations with an input specification in a high-level programming lan...
Field-programmable gate arrays represent an army of logical units which can be organized in a highly...
This archive contains the benchmarks used in the conference paper "Multipurpose Cacheing to accelera...
This dissertation presents a hardware accelerator that is able to accelerate large (including non-pa...
Many algorithms and applications in scientific computing exhibit irregular access patterns as consec...
ABSTRACT Throughput processing involves using many different contexts or threads to solve multiple p...
Caches in FPGAs can improve the performance of soft processors and other applications beset by slow ...
The performance gap between CPUs and memory has widened significantly since the 1980s maki...
To build a shared-memory programming model for FPGAs, a fast and highly parallel method of accessing...
Abstract—To bridge the ever-increasing performance gap between the processor and the main memory in a...
FPGAs rely on massive datapath parallelism to accelerate applications even with a low clock frequenc...
Where do all the cycles go when microprocessor applications are implemented spatially as circuits on...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
Abstract—We describe new multi-ported cache designs suitable for use in FPGA-based processor/parall...