Sparsity has become one of the most promising methods for compressing and accelerating Deep Neural Networks (DNNs). Among the different categories of sparsity, structured sparsity has gained more attention due to its efficient execution on modern accelerators. In particular, N:M sparsity is attractive because hardware accelerator architectures already exist that can leverage certain forms of N:M structured sparsity for higher compute efficiency. In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for N:M sparsity in terms of the trade-off between model accuracy and compute cost (FLOPs). Building upon this study, we propose two new decay-based pruning methods, namely "pruning mask decay" and "sparse st...
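As a concrete illustration of the N:M structured sparsity discussed above, the minimal sketch below builds a 2:4 pruning mask by keeping the two largest-magnitude weights in every group of four. The function name nm_prune_mask and the NumPy-based implementation are illustrative assumptions for exposition only; they do not reproduce the decay-based pruning methods proposed in the work.

import numpy as np

def nm_prune_mask(weights: np.ndarray, n: int = 2, m: int = 4) -> np.ndarray:
    """Return a binary mask enforcing N:M sparsity along the last axis."""
    assert weights.shape[-1] % m == 0, "last dimension must be divisible by M"
    groups = weights.reshape(-1, m)                    # one row per group of M weights
    # indices of the (M - N) smallest-magnitude weights in each group
    drop_idx = np.argsort(np.abs(groups), axis=1)[:, : m - n]
    mask = np.ones_like(groups)
    np.put_along_axis(mask, drop_idx, 0.0, axis=1)     # zero out the pruned positions
    return mask.reshape(weights.shape)

# Example: prune a small weight matrix to 2:4 sparsity
w = np.random.randn(2, 8)
sparse_w = w * nm_prune_mask(w, n=2, m=4)              # exactly 2 nonzeros per group of 4

The key property, which the hardware exploits, is that every aligned group of M consecutive weights contains at most N nonzeros, so the nonzero values and their in-group indices can be stored and fed to the accelerator in a fixed-size compressed format.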
Neural models are known to be over-parameterized, and recent work has shown that sparse text-to-spee...
In deep learning, fine-grained N:M sparsity reduces the data footprint and bandwidth of a General Ma...
We consider the problem of accurate sparse fine-tuning of large language models (LLMs), that is, fin...
The growing energy and performance costs of deep learning have driven the community to reduce the si...
The training of sparse neural networks is becoming an increasingly important tool for reducing the ...
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...
Overparameterized neural networks generalize well but are expensive to train. Ideally, one would lik...
With the dramatically increased number of parameters in language models, sparsity methods have recei...
Neural machine translation (NMT) strongly outperforms previous statistical techniques. With the eme...
The growing size of neural language models has led to increased attention in model compression. The ...
Deep learning has been empirically successful in recent years thanks to the extremely over-parameter...
Model sparsification in deep learning promotes simpler, more interpretable models with fewer paramet...
Sparse training has received an upsurging interest in machine learning due to its tantalizing saving...
Introducing sparsity in a neural network has been an efficient way to reduce its complexity while ke...
Image restoration tasks have witnessed great performance improvement in recent years by developing l...