With technology scaling, maintaining the reliability of dynamic random-access memory (DRAM) has become more challenging. On-die error correction codes (ECC) have therefore been introduced in DDR5 to address reliability issues. However, the current solution still suffers from high overhead when a large DRAM capacity is used to deliver high performance. To address this problem, we present a DRAM chip architecture that tracks DRAM cell faults at byte granularity. Our proposed architecture classifies DRAM faults as temporary or permanent, requiring no additional pins and only minor DRAM chip modifications. Hence, we achieve reliability comparable to that of other state-of-the-art solutions while incurring negligible performance and energy o...
Aggressive process scaling and increasing demands of performance/cost efficiency have exacerbated th...
As DRAM cells continue to shrink, they become more susceptible to retention failures. DRAM cells tha...
New memory technologies and processes introduce new defects that cause previously unknown faults. Dy...
DRAM scaling has been the prime driver for increasing the capacity of main memory systems over the p...
Continued scaling of DRAM technologies induces more faulty DRAM cells than before. These inherent fa...
Reliability of a memory subsystem is one of the most important features for computer system stab...
Memory errors are a major source of reliability problems in current computers. Undetected e...
As technology scaling poses a threat to DRAM scaling due to physical limitations such as limited ch...
Computing systems use dynamic random-access memory (DRAM) as main memory. As prior works have sho...
Memory reliability has been a major design constraint for mission-critical and large-scale systems f...
For decades, main memory has enjoyed the continuous scaling of its physical substrate: DRAM (Dynamic...
With a need to deliver highest quality products operating in all environments, cope with small and u...
DRAMs face several major challenges: On the one hand, DRAM bit cells are leaky and must be refreshed...
As memory technology scales, the demand for higher performance and reliable operation is increasing ...
Most server-grade memory systems provide Chipkill-Correct error protection at the expense of power a...