Today’s computers have gigabytes of main memory thanks to improved DRAM density. As density increases, the smaller bit cells become more susceptible to errors, which in turn increases the need for memory resiliency. Self-testing of memory health can proactively check for errors to improve resiliency. Developing a memory diagnostic is challenging because it must be transparent, scalable, and impose low performance overhead. In my thesis, I developed a software-only self-test that continuously tests memory. I present the challenges and the design of two approaches, COMeT and Asteroid, which are built on a common software framework for memory diagnostics and target chip multiprocessors. COMeT tests memory health ...
Leveraging Storage Class Memory (SCM) as a universal memory--i.e. as memory and storage at the same ...
This is a fault-tolerant random access memory for use in fault-tolerant computers. It comprises a pl...
Memory devices represent a key component of datacenter total cost of ownership (TCO), and techniq...
Memory errors are a major source of reliability problems in current computers. Undetected e...
Hundreds of memory cores can be found on a typical system-on-chip (SOC) today. Diagnosin...
Modern memory systems play a critical role in the performance of applications, but a detailed underst...
Almost all functional safety standards that regulate safety-critical domains impose to periodically ...
Shrinking transistor sizes have introduced new challenges and opportunities for system-on-chip (SoC)...
Modern memory systems play a critical role in the performance of applications, but a detailed unders...
C is a dominant language for implementing system software. Unfortunately, its support for low-level ...
Scientific workflows have become mainstream for conducting large-scale scientific research. As a res...
Modern software systems are deeply embedded into our daily lives; the failures of these systems can ...
The reliability of memory subsystems is worsening rapidly and needs to be considered as one of the p...
Testing and diagnosis techniques play a key role in the advance of semiconductor memory technology. ...