Abstract—Soft error resiliency is a major concern for petascale high-performance computing (HPC) systems. Blue Gene/Q (BG/Q) is the third generation of IBM’s massively parallel, energy-efficient Blue Gene series of supercomputers. The principal goal of this work is to understand the interaction between Blue Gene/Q’s hardware resiliency features and high-performance applications through proton irradiation of a real chip, and the software resiliency inherent in these applications through application-level fault injection (AFI) experiments. From the proton irradiation experiments we derived that the mean time between correctable errors at sea level of the SRAM-based register files and Level-1 caches for a system similar to the scale of Sequoia...
A method is presented for automated improvement of embedded application reliability. The compilation...
HPC systems are widely used in industrial, economical, and scientific applications, and many of thes...
FPGAs are a ubiquitous electronic component utilised in a wide range of electronic systems across ma...
A mathematical model is described to predict microprocessor fault tolerance under radiation. The mod...
With aggressive technology scaling, radiation-induced soft errors have become a major th...
Supercomputers have played an essential role in the progress of science and engineering research. As...
In this paper, we evaluate the error criticality of radiation-induced errors on modern High-Performa...
Current high-performance processors suffer from soft error susceptibility issues which are generate...
The occurrence of transient faults like soft errors in computer circuits poses a significant challen...
Embedded processors have been established as common components in modern systems. Usually, they are p...
We developed a tool for the reliability analysis of SEU effects on the configuration memory of Xili...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
This book introduces the concepts of soft errors in FPGAs, as well as the motivation for using comme...
Radiation-induced soft errors have been widely known since the advent of dynamic RAM chips. Reconfigurable...
The constantly increasing memory density and performance of recent Field Programmable Gate Arrays (F...