The convergence of Stochastic Gradient Descent (SGD) with convex loss functions has been widely studied. However, vanilla SGD methods with convex losses cannot perform well with noisy labels, which adversely affect the updates of the primal variable. Unfortunately, noisy labels are ubiquitous in real-world applications such as crowdsourcing. To handle noisy labels, in this paper we present a family of robust losses for SGD methods. By employing our robust losses, SGD methods reduce the negative effects of noisy labels on each update of the primal variable. We not only reveal the convergence rate of SGD methods using robust losses, but also provide the robustness a...
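The truncated abstract does not specify the robust-loss family, so the following is only a minimal sketch of the general idea: a clipped (ramp-style) hinge loss whose subgradient vanishes on badly misclassified points, so that a flipped label cannot drag the primal variable arbitrarily far. The names `ramp_loss_grad` and `sgd_robust` are illustrative, not from the paper.

```python
import numpy as np

def ramp_loss_grad(w, x, y, s=-1.0):
    """Subgradient in w of the ramp loss min(1 - s, max(0, 1 - y * w @ x)).

    Outside the linear region (margin > 1 or margin < s) the loss saturates,
    so the subgradient is zero and a possibly mislabeled point is ignored.
    """
    margin = y * np.dot(w, x)
    if s < margin < 1.0:            # active linear region of the ramp
        return -y * x
    return np.zeros_like(x)          # saturated: no update from this point

def sgd_robust(X, y, lr=0.1, epochs=10, seed=0):
    """Plain SGD on the robust (ramp) loss; labels y are in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):   # one pass in random order
            w -= lr * ramp_loss_grad(w, X[i], y[i])
    return w
```

The design point the abstract gestures at is visible in `ramp_loss_grad`: unlike a convex hinge, whose gradient magnitude stays constant however wrong a prediction is, the saturating loss bounds the influence any single (possibly noisy) label can have on one update.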
We develop the mathematical foundations of the stochastic modified equations (SME) framework for ana...
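For context, the first-order SME used in this line of work approximates SGD with learning rate $\eta$ on an objective $f$ by an Itô SDE; the display below uses standard notation from that literature and is not quoted from the truncated abstract:

$$\mathrm{d}X_t = -\nabla f(X_t)\,\mathrm{d}t + \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,\mathrm{d}W_t,$$

where $\Sigma(x)$ is the covariance of the stochastic gradient at $x$ and $W_t$ is a standard Wiener process.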
With a weighting scheme proportional to t, a traditional stochastic gradient descent (SGD) algorithm...
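Assuming the truncated line refers to the standard scheme that averages SGD iterates with weights proportional to $t$, a minimal sketch follows; `grad` is a hypothetical stochastic-gradient oracle, not an interface from the paper.

```python
import numpy as np

def sgd_t_weighted(grad, x0, steps, lr0=1.0):
    """SGD with t-proportional iterate averaging.

    Returns x_bar = sum_t t * x_t / sum_t t, the weighted average of the
    iterates, which is known to improve over uniform averaging on strongly
    convex problems when paired with the lr0 / t step-size schedule.
    """
    x = np.asarray(x0, dtype=float)
    x_bar = np.zeros_like(x)
    weight_sum = 0.0
    for t in range(1, steps + 1):
        x = x - (lr0 / t) * grad(x, t)   # decaying step size lr0 / t
        x_bar += t * x                    # iterate t gets weight t
        weight_sum += t
    return x_bar / weight_sum
```

Relative to uniform averaging, weights proportional to $t$ downweight the early iterates, which are still far from the optimum; this is what recovers the faster rate in the strongly convex setting.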
We study a scalable alternative to robust gradient descent (RGD) techniques that can be used when lo...
We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD),...
Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization pr...
Large-scale learning problems require a plethora of labels that can be efficiently coll...
Stochastic mirror descent (SMD) algorithms have recently garnered a great deal of attention in optim...
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" lear...
Recent studies have provided both empirical and theoretical evidence illustrat...
We prove the convergence to minima and estimates on the rate of convergence for the stochastic gradi...
Stochastic Gradient Descent (SGD) is the workhorse beneath the deep learning revolution. However, SG...
Fehrman B, Gess B, Jentzen A. Convergence Rates for the Stochastic Gradient Descent Method for Non-C...
An influential line of recent work has focused on the generalization properties of unregularized gra...