Most server-grade memory systems provide Chipkill-Correct error protection at the expense of power and/or performance overhead. In this paper, we present low-overhead schemes for improving the reliability of commodity DRAM systems with better power and IPC performance compared to Chipkill-Correct solutions. Specifically, we propose two erasure and error correction (E-ECC) schemes for x8 memory systems that have 12.5% storage overhead and do not require any change in the existing memory architecture. Both schemes have superior error performance due to the use of a strong ECC, namely RS(36,32) over GF(2^8). Scheme 1 activates 18 chips per access and has stronger reliability compared to Chipkill-Correct solutions. If the location of th...
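The 12.5% figure and the strength of RS(36,32) can be checked with a little arithmetic. The sketch below is only an illustration, not the paper's decoder: it assumes the textbook property that a Reed-Solomon code with n - k check symbols can correct any mix of t symbol errors and e erasures satisfying 2t + e <= n - k. The function names are hypothetical, and how the two proposed schemes actually exploit this budget is not visible in the truncated abstract.

```python
# Parameters taken from the abstract: RS(36,32) over GF(2^8),
# i.e. 36 eight-bit symbols per codeword, 32 of which carry data.
N_SYMBOLS = 36
K_SYMBOLS = 32


def storage_overhead(n, k):
    """Check symbols expressed as a fraction of data symbols."""
    return (n - k) / k


def correctable_combinations(n, k):
    """List every (errors, erasures) pair with 2*errors + erasures <= n - k.

    This is the standard bound for a maximum-distance-separable code;
    it is assumed here, not derived from the paper.
    """
    parity = n - k
    return [
        (errors, erasures)
        for errors in range(parity // 2 + 1)
        for erasures in range(parity - 2 * errors + 1)
    ]


if __name__ == "__main__":
    print(f"storage overhead: {storage_overhead(N_SYMBOLS, K_SYMBOLS):.1%}")
    for errors, erasures in correctable_combinations(N_SYMBOLS, K_SYMBOLS):
        print(f"{errors} symbol error(s) + {erasures} erasure(s)")
```

Under this bound, four check symbols cover either two symbol errors at unknown positions or up to four erasures at known positions, which is why knowing the location of a fault makes the same code stretch further.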
Future computing platforms will increasingly demand more stringent memory resiliency mechanisms ...
Post-silicon healing techniques that rely on built-in redundancy (e.g. row/column redundanc...
Error-correcting codes (ECC) offer an efficient way to improve the reliability...
Servers and HPC systems often use a strong memory error correction code, or ECC, to meet their relia...
Continued scaling of DRAM technologies induces more faulty DRAM cells than before. These inherent fa...
Continued scaling of DRAM technologies induces more faulty DRAM cells than before. These inherent fa...
Memory system reliability is a serious and growing concern in modern servers. Existing chip...
Reliability is of the utmost importance for safety of electronic systems built for the automotive...
Chipkill correct is an advanced type of error correction in memory that is popular among servers. La...
Memory systems are becoming increasingly error-prone, and thus guaranteeing their reliabil...
Memory reliability has been a major design constraint for mission-critical and large-scale systems f...
Memory reliability has been a major design constraint for mission-critical and large-scale systems f...
An application may have different sensitivity to faults in different subsets of the data it uses. So...
Growing computer system sizes and levels of integration have made memory reliability a primary conce...
With increasing parameter variations in nanometer technologies, on-chip cache in processor ...