The reliability of HPC devices is one of the major concerns for supercomputers today and for the next generation. The high number of devices in large data centers makes it very likely that at least one device is corrupted. In this work, we first evaluate the problem by performing radiation experiments. The data from these experiments give us realistic error rates for HPC devices. Moreover, we evaluate a representative set of algorithms, deriving general insights into the reliability of parallel algorithms and programming approaches. To better understand the problem, we propose a novel methodology that goes beyond quantifying it. We qualify the error by evaluating the criticality of each corrupted execution through a ded...
The increasing computing capacity of multicore components like processors and graphics processing un...
In the last decade, the dominance of the general computing systems market has been replaced by embed...
Reliability has become one of the main issues for computing devices employed in several domains. Thi...
In this paper, we evaluate the error criticality of radiation-induced errors on modern High-Performa...
ISBN 2-913329-58-8. This thesis aims at the study of the behavior of digital processors with respect t...
ISBN 2-84813-004-0. This thesis is devoted to the study of a software methodology for detection of the...
A high-level C++ hardening library is designed for the protection of critical software against the h...
This thesis aims at the study of the behavior of digital processors with respect to one of the effec...
Graphics Processing Units (GPUs) have moved from being dedicated devices for multimedia and gaming ...
The main objective of this thesis is to develop techniques that can be used to analyze and mitigate t...
Nowadays, high-performance microprocessors are demanded in many fields, including those with high-re...
The embedded processors operating in safety- or mission-critical systems are not allowed to fail. An...
This work studies the reliability of embedded systems with approximate computing on software and har...
ARM processors are leaders in embedded systems, delivering high-performance computing, power efficie...