Regularized nonlinear acceleration (RNA) is a generic extrapolation scheme for optimization methods, with marginal computational overhead. It aims to improve convergence using only the iterates of simple iterative algorithms. However, its application to optimization has so far been theoretically limited to gradient descent and other single-step algorithms. Here, we adapt RNA to a much broader setting, including stochastic gradient with momentum and Nesterov's fast gradient. We use it to train deep neural networks and empirically observe that extrapolated networks are more accurate, especially in the early iterations. A straightforward application of our algorithm when training ResNet-152 on ImageNet produces a top-1 test error of 20.88%, improvi...
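The abstract above describes RNA only at a high level. As a concrete illustration (a sketch of ours, not code from the paper), the basic extrapolation step works as follows: collect the last k+1 iterates, form the matrix R of consecutive differences, find weights c that minimize ||Rc||^2 + lam ||c||^2 subject to sum(c) = 1, and return the weighted combination of iterates. The function name rna_extrapolate, the normalization of R^T R, and the default lam are illustrative choices.

import numpy as np

def rna_extrapolate(iterates, lam=1e-8):
    # Minimal RNA sketch. iterates: list of k+1 parameter vectors
    # x_0, ..., x_k produced by any iterative method (gradient descent,
    # SGD with momentum, Nesterov's fast gradient, ...).
    X = np.stack(iterates, axis=1)            # d x (k+1) iterate matrix
    R = X[:, 1:] - X[:, :-1]                  # d x k residuals x_{i+1} - x_i
    RtR = R.T @ R
    RtR = RtR / max(np.linalg.norm(RtR, 2), 1e-30)  # scale so lam is dimension-free
    k = RtR.shape[0]
    # Weights minimize ||R c||^2 + lam ||c||^2 subject to sum(c) = 1:
    # solve (R^T R + lam I) z = 1, then rescale z to sum to one.
    z = np.linalg.solve(RtR + lam * np.eye(k), np.ones(k))
    c = z / z.sum()
    return X[:, :-1] @ c                      # extrapolated point sum_i c_i x_i

# Toy usage: extrapolate ten plain gradient-descent iterates on a quadratic.
A = np.diag(np.linspace(0.01, 1.0, 50))
x = np.ones(50)
iterates = [x.copy()]
for _ in range(10):
    x = x - A @ x                             # gradient step on f(x) = x^T A x / 2
    iterates.append(x.copy())
x_acc = rna_extrapolate(iterates)             # typically much closer to the optimum 0

Because the extrapolated point is computed from stored iterates only, it can be evaluated offline and simply discarded when it does not help, which is consistent with the marginal-overhead claim above.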
Gradient-based methods are often used for optimization. They form the basis of several neural networ...
This thesis aims at developing efficient algorithms for solving some fundamental engineering problem...
The Regularized Nonlinear Acceleration (RNA) algorithm is an acceleration meth...
Deep learning networks are typically trained by Stochastic Gradient Descent (SGD) methods that itera...
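To make the training setting concrete (again an illustration of ours, not code from the cited work), a heavy-ball SGD update, which produces exactly the kind of multi-step iterate sequence the RNA variant above targets, can be written as:

def sgd_momentum_step(x, v, grad, lr=0.1, momentum=0.9):
    # One heavy-ball update: v <- mu * v - eta * g;  x <- x + v.
    # The names and default hyperparameters here are illustrative.
    v = momentum * v - lr * grad
    return x + v, v

Storing the successive x values returned by such steps and passing them to a routine like rna_extrapolate above is the usage pattern the first abstract describes.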
Over the past decade, deep neural networks have solved ever more complex tasks across many fronts in...