Hard and Soft Error Resilience for One-sided Dense Linear Algebra Algorithms

Du, Peng

Publication date

August 2012

Publisher

TRACE: Tennessee Research and Creative Exchange

Language

English

Abstract

Dense matrix factorizations, such as LU, Cholesky, and QR, are widely used by scientific applications that solve systems of linear equations, eigenvalue problems, and linear least squares problems. Such computations are normally carried out on supercomputers, whose ever-growing scale causes a rapid decline in the Mean Time To Failure (MTTF). This dissertation develops fault tolerance algorithms for one-sided dense matrix factorizations that handle both hard and soft errors. For hard errors, we propose methods based on diskless checkpointing and Algorithm Based Fault Tolerance (ABFT) to provide full matrix protection, including the left and right factors that are normally seen in dense matrix factorizations. A horizontal parallel diskless ...
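The checksum idea behind ABFT can be illustrated with a small sketch. It is not the dissertation's algorithm, which the truncated abstract does not spell out; it only shows, under simplifying assumptions, how one extra checksum column lets a single lost data column be rebuilt. The function names `add_checksum_column` and `recover_column` are hypothetical, and the example works on a small in-memory matrix rather than a distributed one.

```python
def add_checksum_column(matrix):
    """Append a column holding each row's sum (a simple row checksum)."""
    return [row + [sum(row)] for row in matrix]


def recover_column(encoded, lost):
    """Rebuild data column `lost` from the checksum and the surviving columns."""
    n = len(encoded[0]) - 1  # number of data columns; index n is the checksum
    repaired = []
    for row in encoded:
        survivors = sum(row[j] for j in range(n) if j != lost)
        fixed = list(row)
        fixed[lost] = row[n] - survivors  # checksum minus what survived
        repaired.append(fixed)
    return repaired


A = [[4.0, 3.0, 2.0],
     [6.0, 3.0, 1.0],
     [2.0, 5.0, 7.0]]

encoded = add_checksum_column(A)

# Simulate a hard error: the data held in column 1 is lost (assumed scenario).
damaged = [list(row) for row in encoded]
for row in damaged:
    row[1] = 0.0

repaired = recover_column(damaged, lost=1)
recovered_data = [row[:3] for row in repaired]
```

A single checksum column can recover only one lost column at a time, and the sketch assumes the location of the loss is known, as it would be when a process fails.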
