While deep neural networks are a highly successful model class, their large memory footprint puts considerable strain on energy consumption, communication bandwidth, and storage requirements. Consequently, model size reduction has become a central goal in deep learning. A typical approach is to train a set of deterministic weights while applying techniques such as pruning and quantization, so that the empirical weight distribution becomes amenable to Shannon-style coding schemes. However, as shown in this paper, relaxing weight determinism and using a full variational distribution over weights allows for more efficient coding schemes and consequently higher compression rates. In particular, following the classical bits-back a...
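The key quantity in the bits-back argument above is the Kullback-Leibler divergence between the variational posterior over weights and the encoding prior: that divergence, not the entropy of a deterministic weight vector, is the expected message length. Below is a minimal sketch of this calculation, assuming a diagonal Gaussian posterior q(w) = N(mu, sigma^2) per weight and a shared zero-mean Gaussian prior; the function name, prior choice, and toy numbers are illustrative assumptions, not the paper's actual coding scheme.

```python
import numpy as np

def bits_back_code_length(mu, sigma, prior_sigma=1.0):
    """Expected bits-back message length (in bits) for weights with a
    diagonal-Gaussian variational posterior q(w) = N(mu, sigma^2) and a
    shared Gaussian prior p(w) = N(0, prior_sigma^2).

    Following the bits-back argument, communicating a sample from q costs
    KL(q || p) nats in expectation; dividing by ln 2 converts to bits.
    """
    kl_nats = (np.log(prior_sigma / sigma)
               + (sigma**2 + mu**2) / (2.0 * prior_sigma**2)
               - 0.5)
    return kl_nats.sum() / np.log(2.0)

# Toy comparison: a near-deterministic posterior costs many bits per weight,
# while a posterior that stays close to the prior costs almost nothing.
rng = np.random.default_rng(0)
mu = rng.normal(scale=0.1, size=10_000)            # learned weight means
tight = bits_back_code_length(mu, np.full_like(mu, 1e-3))
loose = bits_back_code_length(mu, np.full_like(mu, 0.8))
print(f"near-deterministic weights: {tight / 1e3:.1f} kbits")
print(f"high-variance posterior:    {loose / 1e3:.1f} kbits")
```

The comparison makes the trade-off concrete: allowing posterior variance (weight non-determinism) directly reduces the KL term and hence the code length, which is the efficiency gain the abstract refers to.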
Lossy gradient compression has become a practical tool to overcome the communication bottleneck in c...
Latent variable models have been successfully applied in lossless compression with the bits-back cod...
Studies on generalization performance of machine learning algorithms under the...
Modern iterations of deep learning models contain millions (billions) of unique parameters, each repr...
The Minimum Description Length principle (MDL) can be used to train the hidden units of a neural net...
Neural networks have gained widespread use in many machine learning tasks due to their state-of-the-...
This article considers the subject of information losses arising from the finite data sets used in t...
In federated learning (FL), a global model is trained at a Parameter Server (PS) by aggregating mode...
Compressed communication, in the form of sparsification or quantization of stochastic gradients, is ...
The increase in sophistication of neural network models in recent years has exponentially expanded m...
We tackle the problem of producing compact models, maximizing their accuracy f...
In recent years, neural networks have grown in popularity, mostly thanks to the advances in the fiel...
The success of overparameterized deep neural networks (DNNs) poses a great challenge to deploy compu...