Overparameterized neural networks generalize well but are expensive to train. Ideally, one would like to reduce their computational cost while retaining their generalization benefits. Sparse model training is a simple and promising approach to achieve this, but there remain challenges as existing methods struggle with accuracy loss, slow training runtime, or difficulty in sparsifying all model components. The core problem is that searching for a sparsity mask over a discrete set of sparse matrices is difficult and expensive. To address this, our main insight is to optimize over a continuous superset of sparse matrices with a fixed structure known as products of butterfly matrices. As butterfly matrices are not hardware efficient, we propose...
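To make the "products of butterfly matrices" idea concrete, below is a minimal NumPy sketch, not the paper's implementation: each butterfly factor is block-diagonal, with every block split into four diagonal sub-blocks, so a factor of size n has only O(n) nonzeros, while the product of the log2(n) factors is generically dense and supports O(n log n) matrix-vector products (the FFT's factorization has this form). The function names `butterfly_factor` and `butterfly_matrix` are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def butterfly_factor(n: int, k: int, rng) -> np.ndarray:
    """Random butterfly factor of size n at level k (block size 2**k).

    Block-diagonal with n // 2**k blocks; each block has the form
    [[D1, D2], [D3, D4]] where every D_i is diagonal of size 2**(k-1),
    giving 2n nonzeros per factor.
    """
    assert k >= 1 and n % (2 ** k) == 0
    half = 2 ** (k - 1)
    B = np.zeros((n, n))
    for start in range(0, n, 2 * half):          # one block per iteration
        for ri, ci in ((0, 0), (0, half), (half, 0), (half, half)):
            rows = np.arange(start + ri, start + ri + half)
            cols = np.arange(start + ci, start + ci + half)
            B[rows, cols] = rng.standard_normal(half)  # fill one diagonal
    return B

def butterfly_matrix(n: int, seed: int = 0) -> np.ndarray:
    """Product of log2(n) butterfly factors: generically dense,
    but parameterized by only ~2n*log2(n) values instead of n**2."""
    rng = np.random.default_rng(seed)
    m = int(np.log2(n))
    assert 2 ** m == n, "n must be a power of two"
    M = np.eye(n)
    for k in range(m, 0, -1):
        M = M @ butterfly_factor(n, k, rng)
    return M

W = butterfly_matrix(8)   # 8x8 matrix from 3 sparse factors (48 parameters)
```

The point of the sketch is the parameter count: training over this fixed factorized structure replaces a discrete search for a sparsity mask with continuous optimization of the diagonal entries of each factor.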
Recent methods in network pruning have indicated that a dense neural network involves a sparse subne...
Sparse neural networks have been widely applied to reduce the necessary resource requirements to tra...
Sparse neural networks attract increasing interest as they exhibit comparable performance to their d...
Recently, sparse training methods have started to be established as a de facto approach for training...
Sparse training has received surging interest in machine learning due to its tantalizing saving...
Sparsity has become one of the promising methods to compress and accelerate Deep Neural Networks (DN...
Artificial neural networks (ANNs) have emerged as a hot topic in the research community. Despite the ...
The training of sparse neural networks is becoming an increasingly important tool for reducing the ...
The growing energy and performance costs of deep learning have driven the community to reduce the si...
Sparse representation plays a critical role in vision problems, including generation and understandi...
Model sparsification in deep learning promotes simpler, more interpretable models with fewer paramet...
Deep learning has been empirically successful in recent years thanks to the extremely over-parameter...
Graph neural networks have become increasingly popular in recent years due to their ability to natur...
Recently, sparse training has emerged as a promising paradigm for efficient deep learning on edge de...
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...