In machine learning, stochastic gradient descent (SGD) is widely deployed to train models with highly non-convex objectives and equally complex noise models. Unfortunately, SGD theory often makes restrictive assumptions that fail to capture the non-convexity of real problems and almost entirely ignores the complex noise models that arise in practice. In this work, we make substantial progress on this shortcoming. First, we establish that SGD's iterates either converge globally to a stationary point or diverge under nearly arbitrary non-convexity and noise models. Under a slightly more restrictive assumption on the joint behavior of the non-convexity and noise model, one that generalizes current assumptions in the literature, we show that t...
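The converge-or-diverge behavior described above can be illustrated with a minimal sketch of the SGD iteration $x_{k+1} = x_k - \alpha\, g(x_k, \xi_k)$, where $g$ is a noisy gradient oracle. The toy objective, noise model, and step size below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Toy non-convex objective f(x) = x^4 - 3x^2, with stationary
# points at x = 0 and x = +/- sqrt(3/2) ~= 1.2247.
def grad(x):
    return 4 * x**3 - 6 * x  # exact gradient of x^4 - 3x^2

def sgd(x0, steps=20000, lr=1e-3, noise_std=0.5, seed=0):
    """Run SGD with additive Gaussian gradient noise (an assumed
    noise model, chosen only for illustration)."""
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(steps):
        g = grad(x) + noise_std * rng.standard_normal()  # noisy oracle
        x = x - lr * g  # SGD update: x_{k+1} = x_k - alpha * g_k
    return x

x_final = sgd(x0=2.0)
# With a small step size, the iterate settles in a neighborhood of
# a stationary point rather than diverging.
```

With a larger step size or heavier-tailed noise, the same loop can instead diverge, which is the other branch of the dichotomy.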
Stochastic gradient descent (SGD) has been widely used in machine learning due...
Stochastic gradient descent is an optimisation method that combines classical gradient des...
Stochastic Gradient Descent (SGD) type algorithms have been widely applied to many stochastic optimi...
Most existing analyses of (stochastic) gradient descent rely on the condition that for $L$-smooth co...
We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD),...
Recent studies have provided both empirical and theoretical evidence illustrat...
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" lear...
Stochastic gradient descent (SGD) and its variants are the main workhorses for solving large-scale o...
Recent years have seen increased interest in performance guarantees of gradient descent algorithms f...
Stochastic Gradient Descent (SGD) is the workhorse beneath the deep learning revolution. However, SG...
© Springer International Publishing AG 2016. The convergence of Stochastic Gradient Descent (SGD) us...
Stochastic gradient descent (SGD) is a promising numerical method for solving large-scale inverse pr...
In this thesis, we are concerned with the Stochastic Gradient Descent (SGD) algorithm. Specifically,...
The gradient noise of Stochastic Gradient Descent (SGD) is considered to play a key role in its prop...
We prove that stochastic gradient descent (SGD) finds a solution that achieves $(1-\epsilon)$ classi...