The crisis of Moore's law and new dominant Machine Learning workloads require a paradigm shift towards finely tunable precision (a.k.a. transprecision) computing. More specifically, we need floating-point circuits that are capable of operating on many formats with high flexibility. We present the first silicon implementation of a 64-bit transprecision floating-point unit. It fully supports the standard double, single, and half precision, alongside custom bfloat and 8-bit formats. Operations occur on scalars or 2-, 4-, or 8-way SIMD vectors. We have integrated the 247 kGE unit into a 64-bit application-class RISC-V processor core, where the added transprecision support accounts for an energy and area overhead of merely 11% and 9%, respectively; ye...
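The abstract above refers to several reduced-precision formats (double, single, half, bfloat, 8-bit) handled by the hardware unit. As a rough software illustration only, and not the paper's hardware design, the following C sketch shows how one of those formats, bfloat16 (1 sign, 8 exponent, 7 mantissa bits), relates to standard binary32: it keeps just the upper 16 bits of the 32-bit encoding. The function names and the truncation-based rounding are illustrative assumptions; a real unit would typically apply IEEE 754 round-to-nearest-even, and the exact 8-bit format is not reproduced here.

    /* Illustrative sketch (assumptions noted above): software emulation of
     * bfloat16 storage via truncation of a 32-bit IEEE 754 float. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Truncate binary32 to bfloat16 (1 sign, 8 exponent, 7 mantissa bits)
     * by keeping the upper 16 bits of its bit pattern. Assumes 32-bit float. */
    static uint16_t float_to_bf16(float f) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);   /* bit-level view of the float */
        return (uint16_t)(bits >> 16);    /* drop the 16 low mantissa bits */
    }

    /* Expand bfloat16 back to binary32 by zero-filling the low bits. */
    static float bf16_to_float(uint16_t h) {
        uint32_t bits = (uint32_t)h << 16;
        float f;
        memcpy(&f, &bits, sizeof f);
        return f;
    }

    int main(void) {
        float x = 3.14159265f;
        uint16_t b = float_to_bf16(x);
        printf("fp32: %.8f -> bf16 round-trip: %.8f\n", x, bf16_to_float(b));
        return 0;
    }

The round-trip prints a value that agrees with the original only in the leading significand bits, which is exactly the memory-footprint versus accuracy trade-off that transprecision hardware exposes to software.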
Energy-efficient computation is critical if we are going to continue to scale performance i...
In the Internet-of-Things (IoT) domain, microcontrollers (MCUs) are used to collect and process data...
Embedded CPUs typically use much less power than desktop or server CPUs but provide limited ...
The slowdown of Moore's law and the power wall necessitate a shift toward finely tunable precision ...
Ultra-low power computing is a key enabler of deeply embedded platforms used in domains such as dist...
In modern low-power embedded platforms, floating-point (FP) operations emerge as a major contributor...
This paper presents the design and the implementation of a fully combinational floating-point unit (...
Full-precision Floating-Point Units (FPUs) can be a source of extensive hardware overhead in general...
This paper proposes an innovative Floating Point (FP) architecture for Variabl...
Reduced-precision floating-point (FP) arithmetic is being widely adopted to reduce memory footprint ...
This paper presents th...
In recent years, Coarse Grain Reconfigurable Architecture (CGRA) accelerators ...