Numerical Reproducibility for the Parallel Reduction on Multi- and Many-Core Architectures

Collange, Caroline
Defour, David
Graillat, Stef
Iakymchuk, Roman

Publication date

September 2015

Publisher

HAL CCSD

Abstract

On modern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, may become non-deterministic and, therefore, non-reproducible mainly due to the non-associativity of floating-point operations. We introduce an approach to compute the correctly rounded sums of large floating-point vectors accurately and efficiently, achieving deterministic results by construction. Our multi-level algorithm consists of two main stages: a filtering stage that relies on fast vectorized floating-point expansions, and an accumulation stage based on superaccumulators in a high-radix carry-save representation. We present implementations on recent Intel desktop and server processors, Intel Xeon Phi accelerators, an...
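The non-associativity mentioned in the abstract can be seen in a few lines. The following is a minimal illustrative sketch in Python, not the authors' implementation. The names naive_sum, two_sum, and compensated_sum are hypothetical helpers introduced here. The standard-library math.fsum stands in as a reference for an order-independent sum; it is not the superaccumulator scheme the abstract describes. two_sum is Knuth's error-free transformation, the usual building block of floating-point expansions. compensated_sum is a simplified size-2 expansion and is not claimed to be correctly rounded in general.

```python
import math


def naive_sum(values):
    """Left-to-right summation; the result depends on the order of the inputs."""
    total = 0.0
    for v in values:
        total += v
    return total


def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def compensated_sum(values):
    """Assumed size-2 expansion sketch: keep the running sum and the rounding errors apart."""
    s = 0.0
    err = 0.0
    for v in values:
        s, e = two_sum(s, v)
        err += e  # the error term is itself rounded, so this is only more accurate, not exact
    return s + err


values = [1.0, 1e100, 1.0, -1e100]
reordered = [1e100, -1e100, 1.0, 1.0]  # same numbers, different order

print(naive_sum(values))       # 0.0: both 1.0 terms are absorbed by 1e100
print(naive_sum(reordered))    # 2.0: the large terms cancel first
print(math.fsum(values), math.fsum(reordered))  # 2.0 2.0, independent of order
print(compensated_sum(values))  # 2.0: the lost 1.0 terms are recovered from the error terms
```

A parallel reduction effectively picks one of many possible summation orders depending on thread scheduling and vector width, which is why the naive result can differ from run to run, whereas a correctly rounded sum is the same for every order.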
