To improve performance and reduce power, processor designers employ advances that shrink feature sizes, lower voltage levels, reduce noise margins, and increase clock rates. However, these advances make processors more susceptible to transient faults that can affect correctness. While reliable systems typically employ hardware techniques to address soft errors, software techniques can provide a lower-cost and more flexible alternative. This paper presents a novel, software-only, transient-fault-detection technique, called SWIFT. SWIFT efficiently manages redundancy by reclaiming unused instruction-level resources present during the execution of most programs. SWIFT also provides a high level of protection and performance with an enhanced...
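The abstract above does not spell out the transformation itself, so the following is only a minimal sketch of the general idea behind software-only redundancy: compute a value twice on independent variables and compare the two copies before the result is committed to memory. It is written in Python for readability and is not SWIFT's actual compiler pass. The names `FaultDetected`, `checked_store`, `scaled_sum`, and the `inject_fault` flag are hypothetical and exist only for illustration.

```python
class FaultDetected(Exception):
    """Raised when the two redundant copies of a value disagree."""


def checked_store(memory, address, value, shadow_value):
    # Compare the original and the redundant copy before the value
    # becomes visible in memory; a mismatch signals a transient fault.
    if value != shadow_value:
        raise FaultDetected(
            f"mismatch at address {address}: {value} != {shadow_value}"
        )
    memory[address] = value


def scaled_sum(a, b, scale, memory, address, inject_fault=False):
    # Original computation.
    total = a + b
    result = total * scale

    # Redundant computation on separate variables
    # (stand-ins for the shadow registers a compiler would allocate).
    total_dup = a + b
    result_dup = total_dup * scale

    if inject_fault:
        # Hypothetical fault model: flip one bit in the original result.
        result ^= 1 << 3

    # The store is the synchronization point where the copies are checked.
    checked_store(memory, address, result, result_dup)
    return result
```

In a fault-free run, `scaled_sum(2, 3, 4, memory, 0)` stores the result at address `0`. With `inject_fault=True` the two copies disagree, `FaultDetected` is raised, and nothing is written. Checking only at stores, not after every instruction, is one way such a scheme could keep the overhead of the duplicated work low.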
Many current approaches to software-implemented fault tolerance (SIFT) rely on process replication, ...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
Exascale computing systems will require sufficient resilience to tolerate numerous types of hardware...
The general tendency in modern hardware is an increase in fault rates, which is caused by the decrea...
To meet an insatiable consumer demand for greater performance at less power, silicon technology has ...
This article analyzes diverse criteria for effectively implementing selective hardening against soft...
Software-based fault tolerance techniques are a low-cost way to protect processors against soft erro...
Commercial off-the-shelf microprocessors are the core of low-cost embedded systems due to their prog...
Information flow analysis is a widely adopted technique in software testing and malware analysis. Fo...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
Transient faults are emerging as a critical concern in the reliability of general-purpose m...
Embedded systems are increasingly deployed in harsh environments that their components were not nece...
The negative impact of the aggressive scaling of technology nodes on the sensitivity of CMOS devices...
Soft errors (or transient faults) are temporary faults that arise in a circuit due to a variety of i...
As machines increase in scale, it is predicted that failure rates of supercomputers will correspondi...