A crucial aspect in designing a learning algorithm is the selection of the hyperparameters (parameters that are not trained during the learning process). In particular, the effectiveness of stochastic gradient methods strongly depends on the steplength selection. In recent papers [9, 10], Franchini et al. propose to adopt an adaptive selection rule borrowed from the full-gradient scheme known as the Limited Memory Steepest Descent method [8] and appropriately tailored to the stochastic framework. This strategy is based on the computation of the eigenvalues (Ritz-like values) of a suitable matrix obtained from the gradients of the most recent iterations, and it makes it possible to estimate the local Lipschitz constant of the current gradi...
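For concreteness, the following is a minimal NumPy sketch of the Ritz-like value computation underlying this strategy, in the spirit of the limited-memory steepest descent recipe of [8]: the m most recent (possibly stochastic) gradients are collected in a matrix, a small m-by-m matrix is assembled from its QR factorization and from the steplengths already used, and the reciprocals of its positive eigenvalues give the tentative steplengths for the next sweep. The function name, interface, and the toy quadratic check are illustrative assumptions rather than code from [8-10], and the safeguarding and mini-batch handling of the stochastic variant are omitted.

```python
import numpy as np

def ritz_like_steplengths(G, g_new, alphas, eps=1e-10):
    """Sketch of a limited-memory Ritz-value steplength rule.

    G      : (n, m) matrix whose columns are the m most recent gradients
    g_new  : (n,) newest gradient
    alphas : length-m array of the steplengths used at those m iterations
    Returns tentative steplengths (reciprocals of the positive Ritz-like values);
    in practice these are clipped to a safeguard interval.
    """
    n, m = G.shape
    Q, R = np.linalg.qr(G)                 # thin QR factorization: G = Q R
    r = Q.T @ g_new                        # projection of the newest gradient
    # Bidiagonal matrix encoding the recursion g_{j+1} = g_j - alpha_j * A g_j
    J = np.zeros((m + 1, m))
    for i in range(m):
        J[i, i] = 1.0 / alphas[i]
        J[i + 1, i] = -1.0 / alphas[i]
    # Small matrix whose eigenvalues approximate eigenvalues of the Hessian
    T = np.hstack([R, r.reshape(-1, 1)]) @ J @ np.linalg.inv(R)
    theta = np.linalg.eigvals(T).real
    theta = theta[theta > eps]             # keep only positive Ritz-like values
    if theta.size == 0:                    # fallback if none survives
        return np.array([alphas[-1]])
    return np.sort(1.0 / theta)[::-1]

# Toy check on a strictly convex quadratic f(x) = 0.5 x^T A x:
# a few steepest-descent steps are run with exact gradients, and the
# recovered Ritz values should approximate the eigenvalues of A.
rng = np.random.default_rng(0)
A = np.diag([1.0, 4.0, 10.0])
x = rng.standard_normal(3)
grads, alphas = [], []
for _ in range(3):
    g = A @ x
    grads.append(g)
    alpha = (g @ g) / (g @ (A @ g))        # exact Cauchy step for the quadratic
    alphas.append(alpha)
    x = x - alpha * g
steps = ritz_like_steplengths(np.column_stack(grads), A @ x, np.array(alphas))
# 1.0 / steps should approximate (a subset of) the eigenvalues {1, 4, 10} of A
```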
We design step-size schemes that make stochastic gradient descent (SGD) adaptive to (i) the noise σ ...
Gradient-based methods are often used for optimization. They form the basis of several neural networ...
This paper deals with gradient methods for minimizing n-dimensional strictly convex quadratic functi...
A crucial aspect in designing a learning algorithm is the selection of the hyperparameters (paramete...
The steplength selection is a crucial issue for the effectiveness of the stochastic gradient methods...
This paper deals with the steplength selection in stochastic gradient methods for large scale optimi...
The steplength selection is a crucial issue for the effectiveness of the stochastic gradient methods...
In recent years several proposals for the step-size selection have largely improved the gradient met...
Current machine learning practice requires solving huge-scale empirical risk minimization problems q...
The seminal paper by Barzilai and Borwein (1988) has given rise to an extensive investigation, leadi...
Gradient methods are frequently used in large scale image deblurring problems since they avoid the o...
Traditionally, stochastic approximation (SA) schemes have been popular choices for solving stochasti...
Recent years have witnessed huge advances in machine learning (ML) and its applications, especially ...
Noise is inherent in many optimization methods such as stochastic gradient methods, zeroth-order me...