In this paper, we evaluate the criticality of radiation-induced errors on modern High-Performance Computing (HPC) accelerators (Intel Xeon Phi and NVIDIA K40) through a dedicated set of metrics. We show that, where imprecise computing is concerned, simple mismatch detection is not sufficient to evaluate and compare the radiation sensitivity of HPC devices and algorithms. Our analysis quantifies and qualifies radiation effects on the applications’ output by correlating the number of corrupted elements with their spatial locality. We also provide the dataset-wise mean relative error to evaluate the magnitude of radiation-induced errors. We apply the selected metrics to experimental results obtained in various radiation test campaigns for ...
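The three metrics named in the abstract (number of corrupted elements, their spatial locality, and the dataset-wise mean relative error) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `error_criticality`, the `tolerance` parameter standing in for the accuracy an imprecise application can accept, the locality labels (`single`, `line`, `block`, `scattered`), the restriction to a 2D output, and the choice to average the relative error over the whole output.

```python
def error_criticality(golden, observed, tolerance=0.0):
    """Compare a 2D observed output against its fault-free (golden) copy.

    Hypothetical helper: names, labels, and the averaging rule are
    illustrative assumptions, not the paper's definitions.
    """
    rel_errors = {}  # (row, col) -> relative error of each corrupted element
    total = 0        # number of elements in the output
    for r, (g_row, o_row) in enumerate(zip(golden, observed)):
        for c, (g, o) in enumerate(zip(g_row, o_row)):
            total += 1
            # Relative error; fall back to absolute error if the golden value is 0.
            err = abs(o - g) / abs(g) if g != 0 else abs(o - g)
            # Under imprecise computing, deviations within tolerance are not errors.
            if err > tolerance:
                rel_errors[(r, c)] = err

    count = len(rel_errors)
    if count == 0:
        return {"corrupted": 0, "locality": "none", "mean_relative_error": 0.0}

    rows = {r for r, _ in rel_errors}
    cols = {c for _, c in rel_errors}
    if count == 1:
        locality = "single"
    elif len(rows) == 1 or len(cols) == 1:
        locality = "line"       # all corrupted elements share a row or a column
    elif count == (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1):
        locality = "block"      # corrupted elements fill their bounding box
    else:
        locality = "scattered"

    # Assumption: "dataset-wise" means averaging over every output element.
    mean_rel = sum(rel_errors.values()) / total
    return {"corrupted": count, "locality": locality,
            "mean_relative_error": mean_rel}
```

With a nonzero `tolerance`, an output that differs from the golden copy only by small deviations reports zero corrupted elements, whereas a plain mismatch check would flag it as an error. That is the distinction the abstract draws for imprecise computing.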
This work evaluates the error rate of a memory-bound application implemented in...
A mathematical model is described to predict microprocessor fault tolerance under radiation. The mod...
A computer program loads configuration code into a Xilinx field-programmable gate array (FPGA), read...
The reliability of HPC devices is one of the major concerns for supercomputers today and for the next gene...
Graphics Processing Units specifically designed for High Performance Computing applications...
Paper accepted at the 28th IEEE IOLTS 2022. RISC-V architectures have gained importa...
Graphics processing units (GPUs) have become a basic accelerator both in high-performance nodes and l...
This book introduces the concepts of soft errors in FPGAs, as well as the motivation for using comme...
This work proposes a methodology to diagnose radiation-induced faults in a microprocessor using the h...
We investigate the sources of detected unrecoverable errors (DUEs) in graphics processing units (GPU...
ARM processors are leaders in embedded systems, delivering high-performance computing, power efficie...
All Programmable System-on-Chip (APSoC) devices are designed to provide higher overall programmable ...
A high-level C++ hardening library is designed for the protection of critical software against the ...