We present Spartan, a method for training sparse neural network models with a predetermined level of sparsity. Spartan is based on a combination of two techniques: (1) soft top-k masking of low-magnitude parameters via a regularized optimal transportation problem and (2) dual averaging-based parameter updates with hard sparsification in the forward pass. This scheme realizes an exploration-exploitation tradeoff: early in training, the learner is able to explore various sparsity patterns, and as the soft top-k approximation is gradually sharpened over the course of training, the balance shifts towards parameter optimization with respect to a fixed sparsity mask. Spartan is sufficiently flexible to accommodate a variety of sparsity allocation...
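To make the two ingredients concrete, below is a minimal PyTorch sketch of (1) a soft top-k mask computed by solving an entropy-regularized optimal transport problem with log-domain Sinkhorn iterations, and (2) a linear layer that applies a hard top-k magnitude mask in the forward pass while passing gradients straight through to the dense weights. This is an illustration of the general techniques the abstract names, not the authors' reference implementation: the names `soft_topk_mask`, `TopKLinear`, `beta`, and `density` are ours, and the straight-through update is a simple stand-in for the paper's dual averaging scheme.

```python
import math
import torch
import torch.nn.functional as F


def soft_topk_mask(scores: torch.Tensor, k: int, beta: float = 10.0,
                   n_iters: int = 50) -> torch.Tensor:
    """Soft top-k mask via entropy-regularized optimal transport.

    Transports a uniform distribution over the n scores onto a two-point
    target with masses ((n - k)/n, k/n); the mass each score sends to the
    "keep" column is its fractional mask value. Solved with log-domain
    Sinkhorn iterations; as beta grows, the mask sharpens toward the hard
    top-k indicator. (Illustrative sketch, not the paper's code.)
    """
    s = scores.flatten()
    n = s.numel()
    # Column 0 = "drop" (cost 0), column 1 = "keep" (cost -score, so that
    # high-magnitude entries are cheap to keep).
    cost = torch.stack([torch.zeros_like(s), -s], dim=1)              # (n, 2)
    log_mu = torch.full((n,), -math.log(n), dtype=s.dtype)            # rows: uniform
    log_nu = torch.tensor([(n - k) / n, k / n], dtype=s.dtype).log()  # cols: drop/keep
    f = torch.zeros(n, dtype=s.dtype)   # row (per-parameter) potentials
    g = torch.zeros(2, dtype=s.dtype)   # column (drop/keep) potentials
    for _ in range(n_iters):
        # Alternately project onto the row and column marginal constraints.
        f = log_mu - torch.logsumexp(g.unsqueeze(0) - beta * cost, dim=1)
        g = log_nu - torch.logsumexp(f.unsqueeze(1) - beta * cost, dim=0)
    # Transport plan: P_ij = exp(f_i + g_j - beta * C_ij). The mask is the
    # mass sent to "keep", rescaled by n so entries lie in [0, 1] and sum to k.
    return (n * torch.exp(f + g[1] - beta * cost[:, 1])).reshape(scores.shape)


class TopKLinear(torch.nn.Module):
    """Linear layer that applies a hard top-k magnitude mask in the forward
    pass while routing gradients straight through to the dense weights
    (one simple way to realize dual-averaging-style updates)."""

    def __init__(self, in_features: int, out_features: int, density: float = 0.1):
        super().__init__()
        self.weight = torch.nn.Parameter(0.02 * torch.randn(out_features, in_features))
        self.k = max(1, int(density * self.weight.numel()))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Keep the k largest-magnitude weights; zero out the rest.
        threshold = w.abs().flatten().topk(self.k).values.min()
        hard_mask = (w.abs() >= threshold).to(w.dtype)
        # Straight-through estimator: the forward value is the hard-masked
        # weight, but the backward gradient flows to the dense weight.
        w_ste = w + (w * hard_mask - w).detach()
        return F.linear(x, w_ste)
```

In this reading, annealing `beta` upward over the course of training is what realizes the exploration-exploitation tradeoff described above: a small `beta` yields a diffuse mask that lets gradient signal reach many candidate sparsity patterns, while a large `beta` approaches optimization under a fixed hard mask.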