Many data structures (e.g., matrices) are typically accessed with multiple access patterns. Depending on the layout of the data structure in physical address space, some access patterns result in non-unit strides. In existing systems, which are optimized to store and access cache lines, non-unit strided accesses exhibit low spatial locality. Therefore, they incur high latency, and waste memory bandwidth and cache space. We propose the Gather-Scatter DRAM (GS-DRAM) to address this problem. We observe that a commodity DRAM module contains many chips. Each chip stores a part of every cache line mapped to the module. Our idea is to enable the memory controller to access multiple values that belong to a strided pattern from different chip
In this paper, we present Bi-Modal Cache - a flexible stacked DRAM cache organization which simultan...
For decades, main memory has enjoyed the continuous scaling of its physical substrate: DRAM (Dynamic...
As device technologies scale in the nanometer era, the current off-chip DRAM technologies are very c...
Die-stacking technology allows conventional DRAM to be integrated with processors. While numerous op...
The effective bandwidth of the FPGA external memory, usually DRAM, is extremely sensitive to the acc...
Contemporary DRAM systems have maintained impressive scaling by managing a careful balance betwe...
Deep cache hierarchies and the latency-tolerating features of modern superscalar microprocessors hid...
Die-stacked DRAM has been proposed for use as a large, high-bandwidth, last-level cache with hundred...
This paper analyzes the trade-offs in architecting stacked DRAM either as part of main memo...
Many of the current memory architectures embed an SRAM cache within the DRAM memory. These architectu...
For efficient acceleration on FPGA, it is essential for external memory to match the throughput of t...
The latest CPUs employ multiple cores, massively superscalar pipelines, out...
The Impulse Adaptable Memory System exposes DRAM access patterns not seen in conventional memory sys...
Many algorithms and applications in scientific computing exhibit irregular access patterns as consec...
We propose to overcome the memory capacity limitation of GPUs with a Heterogeneous Memory Stack (HMS...