Neural networks are increasingly being used as components in safety-critical applications, for instance, as controllers in embedded systems. Their formal safety verification has made significant progress but typically considers only idealized real-valued networks. For practical applications, such neural networks have to be quantized, i.e., implemented in finite-precision arithmetic, which inevitably introduces roundoff errors. Choosing a suitable precision that is both guaranteed to satisfy a roundoff error bound to ensure safety and that is as small as possible to not waste resources is highly nontrivial to do manually. This task is especially challenging when quantizing a neural network in fixed-point arithmetic, where one can choose amon...
Quantization of neural networks has been one of the most popular techniques to compress models for e...
Recent successes of deep learning have been achieved at the expense of a very high computational and...
Artificial Neural Networks (NNs) can effectively be used to solve many classification and regression...
Deep neural networks (DNNs) have been successfully applied to the approximatio...
To bridge the ever-increasing gap between deep neural networks' complexity and hardware capability, ...
Quantization of deep neural networks is a common way to optimize the networks for deployment on ener...
The ever-growing cost of both training and inference for state-of-the-art neur...
Quantized neural networks are well known for reducing latency, power consumption, and model size wit...
Model quantization helps to reduce model size and latency of deep neural networks. Mixed precision q...
Recently, there has been a push to perform deep learning (DL) computations on the edge rather than t...
We consider the post-training quantization problem, which discretizes the weights of pre-trained dee...
Model quantization is a widely used technique to compress and accelerate deep neural netwo...
Quantization converts neural networks into low-bit fixed-point computations which can be carried out...
The exponentially large discrete search space in mixed-precision quantization (MPQ) makes it hard to...