Recently, low-precision deep learning accelerators (DLAs) have become popular due to their advantages in chip area and energy consumption, yet low-precision quantized models on these DLAs incur severe accuracy degradation. One way to achieve both high accuracy and efficient inference is to deploy high-precision neural networks on low-precision DLAs, which is rarely studied. In this paper, we propose the PArallel Low-precision Quantization (PalQuant) method, which approximates high-precision computations by learning parallel low-precision representations from scratch. In addition, we present a novel cyclic shuffle module to boost cross-group information communication between parallel low-precision groups. Extensive experiments demon...
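As a rough illustration of the idea named in the abstract above (parallel low-precision branches plus a cyclic shuffle between groups), the PyTorch-style snippet below is a minimal sketch under assumed names and structure; it is not the authors' PalQuant implementation. In particular, only the input activations are quantized here (weights stay full precision for brevity), and the cyclic shuffle is modeled as a channel roll by one group width.

```python
import torch
import torch.nn as nn


def quantize_act(x, bits):
    # Uniform quantization of activations clipped to [0, 1] (illustrative only).
    levels = 2 ** bits - 1
    return torch.round(x.clamp(0, 1) * levels) / levels


class ParallelLowPrecisionBlock(nn.Module):
    """Hypothetical sketch: several parallel low-bit branches whose outputs are
    summed to approximate a higher-precision computation, followed by a cyclic
    channel shuffle so the next layer mixes features across groups.
    Class and function names here are assumptions, not PalQuant's actual API."""

    def __init__(self, in_ch, out_ch, groups=4, bits=2):
        super().__init__()
        self.groups = groups
        self.bits = bits
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1) for _ in range(groups)]
        )

    def cyclic_shuffle(self, x):
        # Roll the channel dimension by one group width so every group's
        # features land in a different group's slot for the next layer.
        return torch.roll(x, shifts=x.shape[1] // self.groups, dims=1)

    def forward(self, x):
        # Each branch sees a low-bit view of the input; summing the branch
        # outputs stands in for the higher-precision computation.
        x_q = quantize_act(x, self.bits)
        y = sum(branch(x_q) for branch in self.branches)
        return self.cyclic_shuffle(torch.relu(y))


# Usage on a random feature map.
block = ParallelLowPrecisionBlock(16, 16, groups=4, bits=2)
print(block(torch.rand(1, 16, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```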
Recently, there has been a push to perform deep learning (DL) computations on the edge rather than t...
Large-scale convolutional neural networks (CNNs) suffer from very long training times, spanning from...
Graph Neural Network (GNN) training and inference involve significant challenges of scalability with...
Deep neural networks (DNN) have achieved impressive success in multiple domains. Over the years, the...
Efficient machine learning implementations optimized for inference in hardware have wide-ranging ben...
Owing to the presence of large values, which we call outliers, conventional methods of quantization ...
The large computing and memory cost of deep neural networks (DNNs) often precludes their use in reso...
Deep learning is finding its way into high energy physics by replacing traditional Monte Carlo simul...
Hardware accelerators for Deep Neural Networks (DNNs) that use reduced precision parameters are more...
Although the quest for more accurate solutions is pushing deep learning research towards larger and ...
With the surging popularity of edge computing, the need to efficiently perform neural network infere...
Heavily quantized fixed-point arithmetic is becoming a common approach to deploy Convolutional Neura...
The severe on-chip memory limitations are currently preventing the deployment of the most accurate D...
We introduce a Power-of-Two low-bit post-training quantization (PTQ) method for deep neural network t...
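For context on the abstract above, the snippet below shows what a generic power-of-two quantizer looks like: values are snapped to signed powers of two so multiplications can become bit-shifts. This is a hedged, generic sketch; the exponent-range convention (one bit for sign, the rest for exponent levels) is an assumption and not necessarily the method this abstract proposes.

```python
import torch


def power_of_two_quantize(w, bits=4):
    """Generic power-of-two quantizer (illustrative only): map each value to
    sign(w) * 2^e, with the exponent e rounded to the nearest integer and
    clipped to a range of 2**(bits - 1) levels below the largest magnitude."""
    sign = torch.sign(w)
    # Round log2(|w|) to the nearest integer exponent; clamp to avoid log(0).
    exp = torch.round(torch.log2(w.abs().clamp(min=1e-8)))
    e_max = torch.floor(torch.log2(w.abs().max()))
    e_min = e_max - (2 ** (bits - 1) - 1)
    exp = exp.clamp(e_min, e_max)
    return sign * torch.pow(2.0, exp)


w = torch.randn(8)
print(power_of_two_quantize(w, bits=4))
```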
The increase in sophistication of neural network models in recent years has exponentially expanded m...