Heavily quantized fixed-point arithmetic is becoming a common approach to deploying Convolutional Neural Networks (CNNs) on limited-memory, low-power IoT end-nodes. However, this trend is held back by the lack of support for low-bitwidth operands in the arithmetic units of state-of-the-art embedded Microcontrollers (MCUs). This work proposes a multi-precision arithmetic unit fully integrated into a RISC-V processor at the micro-architectural and ISA level to boost the efficiency of heavily Quantized Neural Network (QNN) inference on microcontroller-class cores. By extending the ISA with nibble (4-bit) and crumb (2-bit) SIMD instructions, we show near-linear speedup with respect to higher-precision integer computation on the key kernels for QNN computatio...
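To make the nibble and crumb SIMD idea concrete, the following is a minimal software model of a packed dot-product with accumulation. It is only a sketch under stated assumptions: a 32-bit register width, signed two's-complement lanes, and the helper names `pack_lanes`, `unpack_lanes`, and `sdotp` are all hypothetical and are not taken from the abstract or from the processor's actual instruction set. It illustrates why narrower operands could give more multiply-accumulates per register: 8 lanes at 4 bits and 16 lanes at 2 bits, against 4 lanes at 8 bits.

```python
WORD_BITS = 32  # assumed register width (not stated in the abstract)


def pack_lanes(values, bits):
    """Pack signed integers into one 32-bit word, lane 0 in the lowest bits."""
    lanes = WORD_BITS // bits
    if len(values) != lanes:
        raise ValueError(f"expected {lanes} values for {bits}-bit lanes")
    mask = (1 << bits) - 1
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    word = 0
    for i, v in enumerate(values):
        if not lo <= v <= hi:
            raise ValueError(f"{v} does not fit in a signed {bits}-bit lane")
        word |= (v & mask) << (i * bits)  # two's-complement encoding per lane
    return word


def unpack_lanes(word, bits):
    """Split a 32-bit word into sign-extended lanes of the given width."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    lanes = []
    for i in range(WORD_BITS // bits):
        lane = (word >> (i * bits)) & mask
        lanes.append(lane - (1 << bits) if lane & sign_bit else lane)
    return lanes


def sdotp(a_word, b_word, acc, bits):
    """Model of a packed dot-product: acc + sum(a[i] * b[i]) over all lanes.

    Assumption: signed-by-signed products accumulated into a wide integer.
    """
    a = unpack_lanes(a_word, bits)
    b = unpack_lanes(b_word, bits)
    return acc + sum(x * y for x, y in zip(a, b))


# Nibble (4-bit) operands: 8 lanes per word, values in [-8, 7].
nibble_a = pack_lanes([1, -2, 3, -4, 5, -6, 7, -8], bits=4)
nibble_b = pack_lanes([1] * 8, bits=4)
nibble_result = sdotp(nibble_a, nibble_b, acc=10, bits=4)

# Crumb (2-bit) operands: 16 lanes per word, values in [-2, 1].
crumb_a = pack_lanes([1] * 16, bits=2)
crumb_b = pack_lanes([-2] * 16, bits=2)
crumb_result = sdotp(crumb_a, crumb_b, acc=0, bits=2)
```

In this model a single call covers 8 or 16 multiply-accumulates, which is the intuition behind a speedup that scales with the reduction in operand width; actual gains depend on the hardware and kernels evaluated in the paper.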
Embedding intelligence in extreme edge devices allows distilling raw data acquired from sensors int...
Deep Neural Networks (DNNs) computation-hungry algorithms demand hardware platforms capable o...
Machine Learning (ML) functions are becoming ubiquitous in latency- and privacy-sensitive IoT applic...
Low bit-width Quantized Neural Networks (QNNs) enable deployment of complex machine learning models ...
The deployment of Quantized Neural Networks (QNN) on advanced microcontrollers requires optimized so...
High energy efficiency and low memory footprint are the key requirements for the deployment of deep ...
Deep Learning is moving to edge devices, ushering in a new age of distributed Artificial Intelligenc...
A lot of recent progress has been made in ultra-low-bit quantization, promising...
We present PULP-NN, a multicore computing library for a parallel ultra-low-power cluster of RISC-V b...
As AI applications become more prevalent and powerful, the performance of deep learning neural netwo...
The current trend for deep learning has come with an enormous computational need for billions of Mul...
Microcontroller Units (MCUs) in edge devices are resource constrained due to their limited memory fo...
We present PULP-NN, an optimized computing library for a parallel ultra-low-power tightly coupled cl...