Data-free quantization compresses a neural network to low bit-width without access to the original training data. Most existing data-free quantization methods suffer severe performance degradation due to inaccurate activation clipping ranges and quantization error, especially at low bit-widths. In this paper, we present a simple yet effective data-free quantization method with accurate activation clipping and adaptive batch normalization. Accurate activation clipping (AAC) improves model accuracy by exploiting accurate activation information from the full-precision model. Adaptive batch normalization is first proposed to address the quantization error caused by distribution changes, by updating the batch normalization layer adaptively...
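As a rough, hypothetical sketch of the activation-clipping idea (the function name, percentile rule, and hyper-parameters below are illustrative assumptions, not the paper's released implementation), the snippet clips a layer's activations to a range estimated from full-precision statistics before applying uniform quantization:

```python
import numpy as np

def clip_and_quantize(acts: np.ndarray, n_bits: int = 4, pct: float = 99.9) -> np.ndarray:
    """Clip activations to a range estimated from full-precision statistics,
    then apply uniform quantization. `pct` is an illustrative hyper-parameter;
    the paper's exact rule for choosing the clipping range may differ."""
    lo, hi = np.percentile(acts, [100.0 - pct, pct])   # clipping range from FP activations
    clipped = np.clip(acts, lo, hi)
    scale = (hi - lo) / (2 ** n_bits - 1)              # step size of the uniform grid
    q = np.round((clipped - lo) / scale)
    return q * scale + lo                              # de-quantized (simulated) activations

# Heavy-tailed activations: naive min/max clipping would waste most levels on outliers.
acts = np.random.standard_t(df=3, size=10_000).astype(np.float32)
print(float(np.abs(acts - clip_and_quantize(acts)).mean()))
```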
We explore calibration properties at various precisions for three architectures: ShuffleNetv2, Ghost...
One-bit quantization is a general tool to execute a complex model, such as deep neural networks, on a...
Batch Normalization (BN) (Ioffe and Szegedy 2015) normalizes the features of an input image via stat...
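For reference, a minimal NumPy sketch of the standard BN transform (per-channel batch statistics, followed by the learned scale and shift):

```python
import numpy as np

def batch_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each channel of an (N, C, H, W) batch with its batch statistics,
    then apply the learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

x = np.random.randn(8, 16, 32, 32).astype(np.float32)
gamma = np.ones((1, 16, 1, 1), dtype=np.float32)
beta = np.zeros((1, 16, 1, 1), dtype=np.float32)
y = batch_norm(x, gamma, beta)
print(y.mean(), y.std())  # roughly 0 and 1 after normalization
```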
While post-training quantization is popular mostly because it avoids access to the orig...
Data-free quantization aims to achieve model quantization without accessing any authentic sample. It...
Quantization is a promising approach for reducing the inference time and memory footprint of neural ...
At present, quantization methods for neural network models are mainly divided into post-training...
Neural network quantization is a highly desired procedure to perform before running neural network...
While neural networks have been remarkably successful in a wide array of applications, implementing ...
Zero-shot quantization is a promising approach for developing lightweight deep neural networks when ...
Quantization of the weights and activations is one of the main methods to reduce the computational f...
Data clipping is crucial in reducing noise in quantization operations and improving the achievable a...
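As a generic illustration of this trade-off (not tied to any specific paper's method), the sketch below quantizes values under different symmetric clipping thresholds and reports the resulting mean squared error:

```python
import numpy as np

def quantize_with_clip(x: np.ndarray, clip: float, n_bits: int = 4) -> np.ndarray:
    """Symmetric uniform quantization: clip to [-clip, clip], then round onto
    a uniform grid with 2**(n_bits - 1) - 1 positive levels."""
    scale = clip / (2 ** (n_bits - 1) - 1)
    return np.round(np.clip(x, -clip, clip) / scale) * scale

# A tight clip lowers rounding noise for typical values but distorts outliers;
# a wide clip does the opposite, so the threshold controls the achievable error.
x = np.random.randn(100_000).astype(np.float32)
for clip in (1.0, 2.0, 4.0):
    mse = np.mean((x - quantize_with_clip(x, clip)) ** 2)
    print(f"clip={clip}: MSE={mse:.5f}")
```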
Recent advancements in machine learning achieved by Deep Neural Networks (DNNs) have been significan...
In deep neural networks (DNNs), there are a huge number of weights and multiply-and-accumulate (MAC)...
When training neural networks with simulated quantization, we observe that quantized weights can, ra...
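For context, a minimal PyTorch sketch of the simulated (fake) quantization setup such observations are made in, using a straight-through estimator; the bit-width and per-tensor scale rule are illustrative assumptions:

```python
import torch

def fake_quantize(w: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Simulated quantization with a straight-through estimator (STE):
    the forward pass uses rounded weights, while gradients skip round()."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.detach().abs().max() / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (w_q - w).detach()  # forward = w_q, backward = identity w.r.t. w

w = torch.randn(64, 64, requires_grad=True)
loss = (fake_quantize(w) ** 2).sum()
loss.backward()               # latent weights receive gradients through the STE
print(w.grad.abs().mean())
```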