Floating-point arithmetic is notoriously non-associative: the limited-precision representation requires that every intermediate value be rounded to fit the available precision. The resulting cyclic dependency in floating-point accumulation inhibits parallelization of the computation, including efficient use of pipelining. In practice, however, we observe that floating-point operations are mostly associative. This observation can be exploited to parallelize floating-point accumulation using a form of optimistic concurrency. In this scheme, we first compute an optimistic associative approximation to the sum and then relax the computation by iteratively propagating errors until the correct sum is obtained. We map this computation to a netw...
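The optimistic scheme described above can be illustrated with a small software sketch. This is not the authors' hardware network or their error-propagation procedure; it is a simplified fixed-point relaxation, written under the assumption that "correct" means "bit-identical to the sequential left-to-right sum". The function names (`sequential_prefix`, `optimistic_prefix`, `relax`) are hypothetical and chosen only for this illustration.

```python
from itertools import accumulate


def sequential_prefix(xs):
    """Reference result: left-to-right floating-point prefix sums."""
    return list(accumulate(xs))


def optimistic_prefix(xs):
    """Optimistic guess: a Hillis-Steele style scan that regroups the
    additions as if + were associative. Each round is data-parallel."""
    p = list(xs)
    d = 1
    while d < len(p):
        p = [p[i] if i < d else p[i - d] + p[i] for i in range(len(p))]
        d *= 2
    return p


def relax(xs, guess):
    """Repair the guess. Every sweep recomputes each prefix from its left
    neighbour's value in the previous sweep, so all entries of one sweep
    are independent of each other. Returns (prefix_sums, sweeps_used)."""
    if not xs:
        return [], 0
    p = list(guess)
    # After k sweeps the first k+1 entries match the sequential result,
    # so len(xs) sweeps are always enough (assumption: plain IEEE doubles).
    for sweeps in range(len(xs) + 1):
        new = [xs[0]] + [p[i - 1] + xs[i] for i in range(1, len(xs))]
        if new == p:
            return p, sweeps
        p = new
    return p, len(xs)


if __name__ == "__main__":
    # Regrouping changes the rounded result:
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))

    xs = [1e16, 1.0, -1e16, 1.0]
    guess = optimistic_prefix(xs)
    fixed, sweeps = relax(xs, guess)
    print(guess[-1], fixed[-1], sweeps)
    print(fixed == sequential_prefix(xs))
```

When the optimistic guess is already consistent, `relax` stops after a single checking sweep; the worst case degenerates to as many sweeps as there are inputs, which is the price of the optimism.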
Nowadays, parallel computing is ubiquitous in several application fields, both in engineering and sc...
We disclose hardware (HW) intrinsic CPU or DSP instructions architecture and microarchitecture that ...
The problem of exactly summing n floating-point numbers is a fundamental problem that has many appli...
Floating-point (FP) addition is non-associative and parallel reduction involvi...
The world depends on computers every day to do accurate real-world mathematics. Computers must store...
On modern multi-core, many-core, and heterogeneous architectures, floating-point co...
Floating-point additions in concurrent execution environment are known to be h...
GPUs are an important hardware development platform for problems where massive...
Floating-point operators on FPGAs do not have to be identical to the ones avai...
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Aggressive pipelining allows FPGAs to achieve high throughput on many digital signal processing appl...