With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to in-the-field faults. To be broadly deployable, the hardware reliability solution must incur low overheads, precluding use of expensive redundancy. We explore a co-designed hardware-software solution that treats most hardware faults as software bugs and leverages common mechanisms for hardware and software reliability, thereby amortizing some of the overhead. Fundamental to such a solution is a characterization of how hardware faults in different microarchitectural structures of a modern processor propagate through the application and OS. This paper aims to provide such a characterization, identify low-cost detection methods to intercept fault propagation...
Intermittent hardware faults are hard to diagnose as they occur non-deterministically. Hardware-only...
As chip technology keeps on shrinking towards higher densities and lower operating voltages, memo...
Transient faults are emerging as a critical reliability concern for modern microprocessors. Recentl...
Hardware errors are projected to increase in modern computer systems due to shrinking feature sizes ...
As chip densities and clock rates increase, processors are becoming more susceptible to transient fa...
As silicon technology continues to scale down and validation expenses continue to increase,...
It is a great challenge to build reliable computer systems with unreliable hardware and buggy softwa...
This paper presents a non-intrusive hybrid fault detection approach that combi...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
Intermittent hardware faults are bursts of errors that last from a few CPU cycles to a few ...
Over three decades of continuous scaling in CMOS technology have led to tremendous improvements in pr...
Intermittent hardware faults are hard to diagnose as they occur non-deterministically at th...