Memory devices represent a key component of datacenter total cost of ownership (TCO), and techniques used to reduce errors that occur on these devices increase this cost. Existing approaches to providing reliability for memory devices pessimistically treat all data as equally vulnerable to memory errors. Our key insight is that there exists a diverse spectrum of tolerance to memory errors in new data-intensive applications, and that traditional one-size-fits-all memory reliability techniques are inefficient in terms of cost. For example, we found that while traditional error protection increases memory system cost by 12.5%, some applications can achieve 99.00% availability on a single server with a large number of memory errors without any err...
Abstract—Memory errors are a major source of reliability problems in current computers. Undetected e...
This paper examines how to design a low-cost and algorithm-based approach that recovers random multi...
...We expect computers to function correctly, despite potential problems like design bugs and ph...
Recent studies estimate that server cost contributes to as much as 57% of the total cost of ownersh...
Computing systems use dynamic random-access memory (DRAM) as main memory. As prior works have sho...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
The workloads running in the modern data centers of large scale Internet service providers (such as A...
System reliability is becoming a significant concern as technology continues to shrink. This is beca...
Thesis (Ph. D.)--University of Rochester. Dept. of Electrical and Computer Engineering, 2012. In moder...
Future computing platforms will increasingly demand more stringent memory resiliency mechanisms ...
Advances in hardware have enabled many long-running applications to execute entirely in main memory....
Disaggregated memory leverages recent technology advances in high-density, byte-addressable non-vola...
Memory error exploitations have been around for over 25 years and still rank among the top 3 most da...