Abstract—In a typical GPGPU, on-chip storage is critical to massive parallelism, and a large capacity is desirable. However, the rapidly growing size of on-chip storage built from traditional SRAM cells, such as the register file (RF), shared memory, and first-level data (L1D) cache, makes area cost and energy consumption unsustainable for future GPGPUs. In this paper, we first propose to use embedded DRAM (eDRAM) as an alternative for on-chip storage. Compared to conventional SRAM, eDRAM offers higher density and lower leakage power but suffers from limited data retention time. Periodic refresh is a viable way to maintain data integrity, but it degrades performance and increases energy consumption with the scaling of ...
Embedded memories, mostly implemented with static random access memory (SRAM), dominate the area and...
Modern graphics processing units (GPUs) support thousands of concurrent threads and provide high comp...
A DRAM cell requires periodic refresh operations to preserve data in its leaky capacitor. Previously...
The heavily-threaded data processing demands of streaming multiprocessors (SM) in a GPGPU require a ...
GPUs require large register files for fast context switching. This paper presents a high-density and...
An effective approach to reduce the static energy consumption of large on-chip memories is to use a...
An effective approach to reduce the static energy consumption of large on-chip memories is to use a ...
Gain-cell embedded DRAM (GC-eDRAM) is a dense, low power option for embedded memory implementation, ...
Graphics Processing Units (GPUs) and other throughput processing architectures have scaled performan...
A gain-cell embedded DRAM (GC-eDRAM) is an attractive logic-compatible alternative to the convention...
GPUs rely heavily on massive multi-threading to achieve high throughput. The massive multi-threadin...
Modern computer systems are power or energy limited. While the number of transistors per chip c...
Recently, general-purpose graphics processing units (GPGPUs) have been widely used to accelerate com...
Current GPU computing models support a mixture of coherent and incoherent classes of memory operatio...