Quantization is one of the most effective methods for compressing neural networks and has achieved great success on convolutional neural networks (CNNs). Recently, vision transformers have demonstrated great potential in computer vision. However, previous post-training quantization methods did not perform well on vision transformers, causing accuracy drops of more than 1% even with 8-bit quantization. We therefore analyze the problems of quantizing vision transformers. We observe that the distributions of activation values after the softmax and GELU functions are quite different from the Gaussian distribution. We also observe that common quantization metrics, such as MSE and cosine distance, are inaccurate for determining the optimal scaling factor. I...
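The issue raised above can be made concrete with a small sketch of the usual post-training calibration baseline: symmetric uniform quantization whose scaling factor is chosen by minimizing the MSE between the original and quantized activations. Everything below (the bit-width, the search grid, the names uniform_quantize and search_scale_mse, and the synthetic post-softmax tensor) is an illustrative assumption rather than code from the abstract; it only shows the MSE-based calibration that the abstract argues is suboptimal for the skewed post-softmax and post-GELU distributions.

# A minimal sketch (assumed, not the paper's implementation) of symmetric uniform
# quantization with an MSE-based search for the activation scaling factor.
import numpy as np

def uniform_quantize(x, scale, n_bits=8):
    # Round to the integer grid defined by `scale`, clip to the signed range,
    # and dequantize so the error against the original tensor can be measured.
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def search_scale_mse(x, n_bits=8, num_candidates=100):
    # Common calibration baseline: pick the scaling factor that minimizes the MSE
    # between the original and quantized tensors over a simple linear grid.
    max_scale = np.abs(x).max() / (2 ** (n_bits - 1) - 1)
    candidates = np.linspace(0.1, 1.0, num_candidates) * max_scale
    errors = [np.mean((x - uniform_quantize(x, s, n_bits)) ** 2) for s in candidates]
    return candidates[int(np.argmin(errors))]

# Post-softmax attention values are heavily concentrated near zero (far from
# Gaussian), which is the distribution mismatch the abstract points out.
rng = np.random.default_rng(0)
logits = rng.standard_normal((8, 197, 197))  # hypothetical attention logits
post_softmax = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
print("MSE-calibrated scale:", search_scale_mse(post_softmax))

With such a skewed input, many distinct near-zero values collapse into the same quantized bin, which illustrates why a single MSE-chosen scaling factor struggles on these activations.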
Vision transformers (ViTs) have recently obtained success in many applications, but their intensive ...
Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range ...
The great success of transformer-based models in natural language processing (NLP) has led to variou...
Network quantization significantly reduces model inference complexity and has been widely used in re...
In this paper, we propose a fully differentiable quantization method for vision transformer (ViT) na...
Vision transformers have recently gained great success on various computer vision tasks; nevertheles...
Data-free quantization can potentially address data privacy and security concerns in model compressi...
Vision Transformer (ViT) architectures are becoming increasingly popular and widely employed to tack...
Attention-based neural networks such as the Vision Transformer (ViT) have recently attained state-of...
Vision Transformers (ViTs) have achieved state-of-the-art performance on various computer vision app...
While neural networks have been remarkably successful in a wide array of applications, implementing ...
Neural network quantization is a highly desired procedure to perform before running neural network...
We attempt to reduce the computational costs in vision transformers (ViTs), which increase quadratic...
The vision transformer (ViT) has advanced to the cutting edge in the visual recognition task. Transf...