As transistor technology scales ever further, hardware reliability is becoming harder to manage. The effects of soft errors, variability, wear-out, and yield are intensifying to the point where it becomes difficult to harness the benefits of deeper scaling without mechanisms for hardware fault detection and correction. We observe that the combination of emerging applications and emerging many-core architectures makes software recovery a viable and interesting alternative to traditional, hardware-based fault recovery. Emerging applications tend to have few I/O and memory side-effects, which limits the amount of information that needs checkpointing, and they allow discarding individual sub-computations with typically minimal qualitative impa...
As machines increase in scale, it is predicted that failure rates of supercomputers will correspondi...
Recent trends in transistor technology have dictated the constant reduction of device size. One nega...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
Aggressive scaling of CMOS transistors has enabled extensive system integration and building faster ...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
To meet an insatiable consumer demand for greater performance at less power, silicon technology has ...
It is a great challenge to build reliable computer systems with unreliable hardware and buggy softwa...
Conventional CAD methodologies optimize a processor module for correct operation and prohibit timing...
This paper presents ReVive, a novel general-purpose rollback recovery mechanism for shared-memory mu...
Failing hardware is a fact and trends in microprocessor design indicate that the fraction of hardwar...
In recent years, circuit reliability in modern high-performance processors has become increasingly i...
In recent years, circuit reliability in modern high-performance processors has become increasingly i...
The design of microprocessors is undergoing radical changes that affect the performance and reliabil...
The ever-increasing miniaturization of semiconductors has led to important advances in mobile, cloud...
Processor reliability at upcoming technology nodes presents significant challenges to designers from...