Several recent publications have shown that hardware faults in the memory subsystem are commonplace. These faults are predicted to become more frequent in future systems that contain orders of magnitude more DRAM and SRAM than found in current memory subsystems. These memory subsystems will need to provide resilience techniques to tolerate these faults when deployed in high-performance computing systems and data centers containing tens of thousands of nodes. Therefore, it is critical to understand the efficacy of current hardware resilience techniques to determine whether they will be suitable for future systems. In this paper, we present a study of DRAM and SRAM faults and errors from the field. We use data from two leadership-class hig...
Memory hardware reliability is an indispensable part of whole-system dependability. This paper prese...
Recent studies estimate that server cost contributes to as much as 57% of the total cost of ownersh...
In recent years, embedded memories have been the fastest-growing segment of systems on chip. They therefore...
Several recent publications confirm that faults are common in high-performance computing systems. Th...
Computing systems use dynamic random-access memory (DRAM) as main memory. As prior works have sho...
Aggressive process scaling and increasing demands of performance/cost efficiency have exacerbated th...
Doctor. Reliability of a memory subsystem is one of the most important features to computer system stab...
This paper summarizes our two-year study of corrected and uncorrected errors on the MareNostrum 3 s...
Abstract: DRAM testing has always been theoretically considered as a subset of general memory testin...
Supercomputers offer new opportunities for scientific computing as they grow in size. However, their...
Abstract: Memory errors are a major source of reliability problems in current computers. Undetected e...
In this paper, we present a novel study on Data Retention Faults (DRFs) in SRAM memories. We analyze...
DRAM scaling has been the prime driver for increasing the capacity of the main memory system over the p...
Thesis (Ph. D.)--University of Rochester. Dept. of Electrical and Computer Engineering, 2012. In moder...
Memory hardware reliability is an indispensable part of whole-system dependability. Its importance...