Although neural network quantization is essential for the computation and memory efficiency of embedded neural network accelerators, simple post-training quantization incurs unacceptable accuracy degradation on some important models targeting embedded systems, such as MobileNets. While explicit quantization-aware training or re-training after quantization can often reclaim lost accuracy, this is not always possible or convenient. We present an alternative approach to compressing such difficult neural networks, using a novel variant of the ZFP lossy floating-point compression algorithm to compress both model weights and inter-layer activations, and demonstrate that it can be efficiently implemented on an embedded FPGA...
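For context on the "simple post-training quantization" this abstract contrasts against, the following is a minimal, hypothetical sketch of per-tensor affine quantization to 8-bit integers (pure Python, not the paper's ZFP-based method; all names are illustrative):

```python
# Hypothetical sketch: per-tensor affine post-training quantization.
# A float tensor is mapped to unsigned 8-bit codes via a scale and
# zero point derived from its observed min/max range.

def quantize(values, num_bits=8):
    """Quantize a list of floats to integer codes in [0, 2^num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constant tensors
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer codes back to approximate float values."""
    return [(qi - zero_point) * scale for qi in q]
```

The round-trip error of this scheme is bounded by the scale (one quantization step), which is exactly what becomes problematic for models such as MobileNets, whose per-channel weight ranges vary widely within a layer.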
Autonomous cars are complex applications that need powerful hardware machines to be able to function...
Real-time inference of deep convolutional neural networks (CNNs) on embedded systems and SoCs would ...
The increase in sophistication of neural network models in recent years has exponentially expanded m...
In the wake of the success of convolutional neural networks in image classification, object recognit...
Over the last decade, various deep neural network models have achieved great success in image recogn...
Deep neural networks (DNNs) are a key technology nowadays and the main driving factor for many recen...
Hardware accelerators such as GPUs and FPGAs can often provide enormous computing capabilities and p...
We investigate the compression of deep neural networks by quantizing their weights and activations i...
Parallel hardware accelerators, for example Graphics Processor Units, have limited on-chip memory ca...
Over the past decade, machine learning (ML) with deep neural networks (DNNs) has become ext...
The training of deep neural networks (DNNs) requires intensive resources both for computation and fo...
Quantization of deep neural networks is a common way to optimize the networks for deployment on ener...
The severe on-chip memory limitations are currently preventing the deployment of the most acc...
Convolutional Neural Networks (CNNs) were created for image classification tasks. Quickly, they were...