We consider the problem of computing IEEE floating-point squares by means of integer arithmetic. We show how the specific properties of squaring can be exploited to design and implement algorithms that have much lower latency than those for general multiplication, while still guaranteeing correct rounding. Our algorithm descriptions are parameterized by the floating-point format, aim to expose a high degree of instruction-level parallelism (ILP), and cover all rounding modes. We further show that their C implementation for the binary32 format yields efficient code for targets such as the ST231 VLIW integer processor from STMicroelectronics, with a latency at least 1.75x smaller than that of general multiplication in the same context.
©2001 IEEE. Personal use of this material is permitted. However, permission to reprint/republish thi...
Today some embedded systems still do not integrate their own floating-point unit, for area, cost, or...
With continued reductions in feature size, additional functionality may be added to future microproc...
This paper presents some work in progress on fast and accurate floating-point ...
This paper deals with the design and implementation of low latency software for binary floating-poin...
In this paper we show how to reduce the computation of correctly-rounded square roots of binary floa...
We present algorithms for performing the five elementary arithmetic operations (+, -, ×, ÷, and √) i...
This paper presents an optimized software implementation of the reciprocal squ...
Modern floating-point multipliers perform rounding in compliance with the IEEE 754 standard. Since r...
We present algorithms for accurately converting floating-point numbers to decimal representation. Th...
We analyze two fast and accurate algorithms recently presented by Borges for c...
The representation formats and behaviors of floating point arithmetics available in computers are de...
Invited paper - MACIS 2015 (Sixth International Conference on Mathematical Aspects of Computer and I...