Abstract: Microprocessors are becoming increasingly vulnerable to soft errors as semiconductor technology continues to scale. Traditional redundant multi-threading architectures provide perfect fault tolerance by re-executing all computations. However, full re-execution significantly increases the verification workload on processor resources, resulting in severe performance degradation. This paper presents a proactive verification management approach that reduces the verification workload, and thereby improves performance, with minimal effect on overall reliability. An anomaly-speculation-based filter checker is proposed to assign verification priorities before the re-execution process starts. This technique...
As semiconductor technology scales into the deep submicron regime, the occurrence of transient or sof...
Abstract: Symptom-based fault-tolerant methods are attractive, since they replicate a minor part of ...
It is a great challenge to build reliable computer systems with unreliable hardware and buggy softwa...
Traditional fault-tolerant multi-threading architectures provide good fault tolerance by re-executin...
Abstract: Many methods are available to detect silent errors in high-performance computing (HPC) app...
Successive generations of processors use smaller transistors in the quest to make more powerful comp...
Many methods are available to detect silent errors in high-performance computi...
Resilience has become a critical problem for high-performance computing. Checkpointing protocols ar...
As device dimensions continue to be aggressively scaled, microprocessors are becoming increasingly v...
This paper proposes the use of metrics to refine system design for soft error protection in system ...
IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFTS)...
The negative impact of the aggressive scaling of technology nodes on the sensitivity of CMOS devices...
As device dimensions continue to be aggressively scaled, microprocessors are becoming increasingly ...
This paper presents an empirical investigation of the soft error sensitivity (SES) of microprocessor...
Soft errors are an important challenge in contemporary microprocessors. Particle hits on the compone...