This paper presents mathematical foundations for the design of a memory controller subcomponent that helps to bridge the processor/memory performance gap for applications with strided access patterns. The Parallel Vector Access (PVA) unit exploits the regularity of vectors or streams to access them efficiently in parallel on a multibank SDRAM memory system. The PVA unit performs scatter/gather operations so that only the elements accessed by the application are transmitted across the system bus. Vector operations are broadcast in parallel to all memory banks, each of which implements an efficient algorithm to determine which vector elements it holds. Earlier performance evaluations have demonstrated that our PVA implementation loads element...
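The per-bank question described above (which elements of a strided vector fall in a given bank) can be illustrated with a minimal sketch. It assumes a word-interleaved memory in which word address `a` lives in bank `a mod num_banks`; this layout, the function names, and the parameters are hypothetical choices for illustration, and the closed-form version is a standard modular-arithmetic approach, not necessarily the algorithm the PVA unit implements.

```python
from math import gcd


def elements_in_bank_scan(base, stride, length, bank, num_banks):
    """Reference version: test every element index i in [0, length).

    Assumes word interleaving: address a maps to bank a % num_banks.
    """
    return [i for i in range(length)
            if (base + i * stride) % num_banks == bank]


def elements_in_bank_closed_form(base, stride, length, bank, num_banks):
    """Solve stride * i == bank - base (mod num_banks) without scanning.

    Returns the same indices as the scan version for stride >= 0.
    """
    g = gcd(stride, num_banks)
    delta = (bank - base) % num_banks
    if delta % g != 0:
        return []              # this bank holds no element of the vector
    period = num_banks // g    # hits repeat every `period` elements
    if period == 1:
        first = 0              # every element lands in this bank
    else:
        # Modular inverse via three-argument pow (Python 3.8+).
        inverse = pow(stride // g, -1, period)
        first = (delta // g) * inverse % period
    return list(range(first, length, period))


if __name__ == "__main__":
    # Vector of 16 words, base address 0, stride 3, on 8 banks.
    for b in range(8):
        print(b, elements_in_bank_closed_form(0, 3, 16, b, 8))
```

Because each bank can evaluate such a computation independently once the base, stride, and length are broadcast, no bank needs to enumerate the elements held by the others.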
The concept of Parallel Vector (scratch pad) Memories (PVM) was introduced as one solution for Paral...
The bandwidth mismatch between processor and main memory is one major limiting problem. Although str...
The Structured Memory Access (SMS) architecture implementation presented in this thesis is formulate...
We are attacking the memory bottleneck by building a “smart” memory controller that improves effect...
Memory system efficiency is crucial for any processor to achieve high performance, especially in the...
This paper introduces an innovative cache design for vector computers, called prime-mapped cache. By...
Vector supercomputers, which can process large amounts of vector data efficiently, are among the fas...
This paper presents an experimental study on cache memory designs for vector computers. We use an ex...
On many commercial supercomputers, several vector register processors share a global highly interlea...
In this work, we propose a Programmable Vector Memory Controller (PVMC), which boosts noncontiguous ...
Single-Instruction-Multiple-Data (SIMD) architectures are widely used to accelerate applications inv...
The purpose of this paper is to show that multi-threading techniques can be applied to a vector proc...
To manage power and memory wall effects, the HPC industry supports FPGA reconfigurable accelerators ...