L1-regularized models are widely used for sparse regression and classification tasks. In this paper, we propose the orthant-wise passive descent algorithm (OPDA) for solving L1-regularized models, as an improved substitute for proximal algorithms, which are currently the standard tools for optimizing such models. OPDA uses a stochastic variance-reduced gradient (SVRG) to initialize the descent direction, then applies a novel alignment operator that encourages each element to keep the same sign after one update iteration, so the parameter remains in the same orthant as before. It also explicitly suppresses the magnitude of each element to impose sparsity. The quasi-Newton update can be utilized to incorporate curvature information and accelera...
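The update described in this abstract can be illustrated with a minimal sketch (assuming a NumPy setting; the helper names svrg_direction and orthant_step, the step size eta, and the L1 weight lam are illustrative and not the paper's reference implementation): a variance-reduced gradient step is followed by a magnitude shrinkage and a sign-alignment that zeroes any coordinate leaving its orthant.

import numpy as np

def svrg_direction(grad_i_w, grad_i_snapshot, full_grad_snapshot):
    # Variance-reduced stochastic gradient: g_i(w) - g_i(w_tilde) + grad f(w_tilde).
    return grad_i_w - grad_i_snapshot + full_grad_snapshot

def orthant_step(w, direction, eta, lam):
    # One orthant-wise step (sketch): descend, shrink magnitudes by the L1
    # weight, then zero any coordinate whose sign would flip so the iterate
    # stays in the same orthant as w.
    w_new = w - eta * direction
    w_new = w_new - eta * lam * np.sign(w_new)
    flipped = np.sign(w_new) * np.sign(w) < 0
    w_new[flipped] = 0.0
    return w_new

# Toy usage on a least-squares loss; the full-batch gradient stands in for
# the SVRG direction to keep the example short.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 10)), rng.normal(size=50)
w = np.zeros(10)
for _ in range(200):
    g = A.T @ (A @ w - b) / len(b)
    w = orthant_step(w, g, eta=0.1, lam=0.1)
print("nonzero coordinates:", np.count_nonzero(w))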
Owing to their statistical properties, non-convex sparse regularizers have attracted much interest f...
We introduce a new algorithm, extended regularized dual averaging (XRDA), for solving regularized st...
We propose a new sparse model construction method aimed at maximizing a model's generalisation capab...
In this paper, we consider the problem of training structured neural networks (NN) with nonsmooth re...
Regularization plays an important role in generalization of deep learning. In this paper, we study t...
We consider a reformulation of Reduced-Rank Regression (RRR) and Sparse Reduce...
This paper explores a new framework for reinforcement learning based on online convex optimization, ...
We investigate implicit regularization schemes for gradient descent methods applied to unpenalized l...
Besides the minimization of the prediction error, two of the most desirable properties of a regression...
Recently, Yuan et al. (2010) conducted a comprehensive comparison of software for L1-regularized cla...
Injecting noise within gradient descent has several desirable features. In this paper, we explore no...
We propose a novel general algorithm LHAC that efficiently uses second-order information to train a ...
We study the convergence, the implicit regularization and the generalization of stochastic mirror de...
Online learning algorithms often need to recompute least squares regression estimates of paramete...