The advent of extreme scale machines will require the use of parallel resources at an unprecedented scale, probably leading to a high rate of hard faults and soft errors. Fully handling these faults at the computer system level may have a prohibitive cost. High performance computing applications that aim to exploit all these resources will thus need to be resilient, i.e., able to compute a correct solution in the presence of faults. We focus on numerical linear algebra problems, such as the solution of linear systems or eigenproblems, that are the innermost numerical kernels in many scientific and engineering applications and also among the most time-consuming parts. To address hard faults on computing cores, we first ...
Devices are increasingly vulnerable to soft errors as their feature sizes shrink. Previously, soft e...
As we stride toward the exascale era, due to increasing complexity of supercomputers, hard and soft ...
Parallel implementations of Krylov subspace methods often help to accelerate the procedure of findin...
The advent of extreme scale machines will require the use of parallel resource...
In this talk we will discuss possible numerical remedies to survive data loss...
The advent of extreme scale machines will require the use of parallel resource...
In this talk we will discuss possible numerical remedies to survive data loss ...
The advent of extreme scale machines will require the use of parallel resour...
The advent of extreme scale machines will require the use of parallel resources...
ELLIOTT III, JAMES JOHN. Resilient Iterative Linear Solvers Running Through Errors. (Under the direc...
As the computational power of high performance computing (HPC) systems continu...
In the multi-peta-flop era for supercomputers, the number of computing cores is growing expo...
As large-scale linear equation systems are pervasive in many scientific fields, great efforts have b...
This paper presents a method to protect iterative solvers from Detected and Uncorrected Errors (DUE)...
We present a fault model designed to bring out the “worst” in iterative solvers based on mathematica...