We introduce an area/energy-efficient precision-scalable neural network accelerator architecture. Previous precision-scalable hardware accelerators have limitations such as under-utilization of multipliers for low bit-width operations and large area overhead to support various bit precisions. To mitigate these problems, we first propose a bitwise summation, which reduces the area overhead for bit-width scaling. In addition, we present a channel-wise aligning scheme (CAS) to efficiently fetch inputs and weights from on-chip SRAM buffers and a channel-first and pixel-last tiling (CFPL) scheme to maximize the utilization of multipliers across various kernel sizes. A test chip was implemented in 28-nm CMOS technology, and the experimen...
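As a rough illustration of the bit-level decomposition that precision-scalable MAC arrays rely on, the minimal Python sketch below shows how an N-bit by M-bit unsigned product can be formed as a shifted sum of 1-bit partial products. The function bitwise_mac is a hypothetical name for illustration only; it does not reproduce the paper's bitwise-summation datapath, CAS, or CFPL schemes.

# Illustrative sketch (not the paper's datapath): decompose an N-bit x M-bit
# unsigned multiply into 1-bit partial products summed with shifts, so the
# same 1-bit multiplier fabric can serve different bit widths.

def bitwise_mac(activations, weights, a_bits, w_bits):
    """Multiply-accumulate built only from shifted 1-bit partial products."""
    acc = 0
    for a, w in zip(activations, weights):
        for i in range(a_bits):                  # bit position of the activation
            a_bit = (a >> i) & 1
            for j in range(w_bits):              # bit position of the weight
                w_bit = (w >> j) & 1
                acc += (a_bit & w_bit) << (i + j)  # shifted 1-bit product
    return acc

# Sanity check against an ordinary integer MAC at 4-bit precision.
acts, wts = [3, 7, 12], [5, 2, 9]
assert bitwise_mac(acts, wts, 4, 4) == sum(a * w for a, w in zip(acts, wts))

Because each partial product costs only a single AND operation, the same 1-bit fabric can be reconfigured for 2-, 4-, or 8-bit operands simply by changing how many shifted partial sums are accumulated, which is the property precision-scalable designs exploit.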
Quantized neural networks are well known for reducing latency, power consumption, and model size wit...
As AI applications become more prevalent and powerful, the performance of deep learning neural netwo...
IEEE International Symposium on Circuits and Systems (ISCAS) Proceedings, ISSN 0271-4302: Binary Neur...
Over the last ten years, the rise of deep learning has redefined the state-of-the-art in many comput...
The current trend for deep learning has come with an enormous computational need for billions of Mul...
Model quantization is a widely used technique to compress and accelerate deep neural netwo...
Real-time inference of deep convolutional neural networks (CNNs) on embedded systems and SoCs would ...
With the surging popularity of edge computing, the need to efficiently perform neural network infere...
Owing to the presence of large values, which we call outliers, conventional methods of quantization ...
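The outlier issue raised above can be illustrated with a small experiment. The sketch below assumes a plain uniform symmetric quantizer (not the cited work's method, whose details are truncated here): a few large values set the quantization scale, so the bulk of the distribution is represented with far coarser steps and higher error.

import numpy as np

# Illustrative sketch (assumed uniform symmetric quantizer): the scale is set
# by the largest magnitude, so a couple of outliers degrade everything else.

def quantize_dequantize(x, bits):
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)   # range set by largest value
    q = np.clip(np.round(x / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 10_000)            # typical activation-like values
x_out = np.concatenate([x, [50.0, -60.0]])  # same data plus two outliers

err_clean = np.mean((x - quantize_dequantize(x, 4)) ** 2)
err_outlr = np.mean((x - quantize_dequantize(x_out, 4)[:-2]) ** 2)
print(f"4-bit MSE without outliers: {err_clean:.4f}")
print(f"4-bit MSE with outliers:    {err_outlr:.4f}")  # much larger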
Neural networks are a subset of machine learning that are currently rapidly being deployed for vario...
Recently, there has been a push to perform deep learning (DL) computations on the edge rather than t...
Heavily quantized fixed-point arithmetic is becoming a common approach to deploy Convolutional Neura...