With shrinking device sizes and increasing complexity, soft errors are becoming a concern for the reliability of digital systems. To build efficient, robust systems, it is important to understand how soft errors affect the output quality of the target applications. Probabilistic inference applications are interesting because they produce inexact results and yet are useful in many different fields. Our fault injection experiments show that some of these applications can mask or quickly recover from most transient data errors. In addition, their approximate nature enables low-cost fault recovery mechanisms for control-flow errors. This allows us to use simple software modifications and checkpointing to drastically reduce the number of program ...
Emerging high-performance architectures are anticipated to contain unreliable components that may ex...
With the massive adoption of machine learning (ML) applications in HPC domains, the reliability of M...
We develop a simple model that computes the probability that a strike at the output of a gate has an...
ISBN: 978-1-4244-4321-5. Evaluating the potential functional effects of soft erro...
This thesis investigates techniques for making closed loop control systems fault-tolerant and robust...
Traditionally, fault tolerance researchers have made very strict assumptions about program correctne...
Soft errors caused by transient bit flips have the potential to significantly impact an application's...
Soft errors are faults that are not caused by defective hardware; rather, they are induced by no...
Bit flips are known to be a source of strange system behavior, failures, and crashes. They can cause...
In this paper the behavior of assertion-based error detection mechanisms is characterized under faul...
This paper proposes the use of metrics to refine system design for soft errors protection in system ...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
ELLIOTT III, JAMES JOHN. Resilient Iterative Linear Solvers Running Through Errors. (Under the direc...
What is the probability that the execution state of a given microprocessor running a given applicati...
Embedded systems are increasingly deployed in harsh environments that their components were not nece...