We investigate the sources of detected unrecoverable errors (DUEs) in graphics processing units (GPUs) exposed to a neutron beam. Illegal memory accesses and interface errors are among the more likely sources of DUEs. Enabling error-correcting code (ECC) increases the number of launch failure events. Our test procedure shows that ECC can reduce the DUEs caused by illegal address accesses by up to 92% for Kepler and by up to 98% for Volta. In addition, we analyze whether compiler optimizations affect the distribution of DUE sources for matrix multiplication. We found that the machine code generated at different optimization levels changes the DUE source distribution by no more than 24% on average.
With shrinking process technology, the primary cause of transient faults in semiconductors shifts aw...
This work was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - ...
We have measured probabilities for proton, neutron and pion beams from accelerators to induce tempor...
We investigate the sources of detected unrecoverable errors (DUEs) in graphics processing units (GPU...
Graphics Processing Units specifically designed for High Performance Computing applications...
In this paper, we compare the radiation response of GPUs executing matrix multiplication and FFT alg...
We characterize the fault models for Deep Neural Networks (DNNs) in GPUs expos...
Paper accepted at the 28th IEEE IOLTS 2022. RISC-V architectures have gained importa...
In this paper, we evaluate the error criticality of radiation-induced errors on modern High-Performa...
Recently, General Purpose Graphic Processing Units (GPGPUs) have begun to be preferred to CPUs for s...
Neutrons may produce charged particles, which can affect modern electronic components. Depending on ...
Thanks to the capability of efficiently executing massive computations in parallel, General Purpose ...
GPU (Graphics Processing Unit) is emerging as an efficient and scalable accelerator for data-paralle...
The reliability evaluation of Deep Neural Networks (DNNs) executed on Graphic ...
Constructing a dependable and fault-tolerant system is inherently difficult. Not only should the sys...