Deep neural networks have achieved significant success in a number of challenging engineering problems. The fundamental tool in training deep neural networks is Stochastic Gradient Descent (SGD) applied to the ``loss'' function, $f(x)$, which is high dimensional and nonconvex; in the continuous-time limit, its dynamics are commonly modeled by the stochastic differential equation \begin{equation}\label{SGDintro}\tag{SGD} dx_t = -\nabla f(x_t)\, dt + dW_t. \end{equation} There is consensus in the community that some form of smoothing (regularization) of the loss function is needed, and there have been hundreds of papers and many conferences on this topic in the past three years. However, so far there has been little analysis by mathematicians.
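The continuous-time dynamics in \eqref{SGDintro} can be simulated with a standard Euler--Maruyama discretization. The following minimal sketch (the quadratic loss $f(x)=\tfrac12\|x\|^2$ and all function names are illustrative assumptions, not from the text) shows the iterate drifting toward the minimizer while fluctuating under the Brownian noise term:

```python
import numpy as np

def euler_maruyama(grad_f, x0, dt=0.01, n_steps=5000, seed=0):
    """Euler--Maruyama discretization of dx_t = -grad f(x_t) dt + dW_t."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        # drift step: descend the gradient of the loss
        x -= dt * grad_f(x)
        # diffusion step: Brownian increment with variance dt
        x += np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# Toy quadratic loss f(x) = 0.5 * ||x||^2, so grad f(x) = x.
x_final = euler_maruyama(lambda x: x, x0=[5.0, -5.0])
```

For this quadratic loss the process is Ornstein--Uhlenbeck, so the iterate does not converge to a point but settles into a stationary distribution around the minimizer at the origin, which is one way the noise term "smooths" the optimization.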
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and i...
In this thesis, we study model parameterization for deep learning applications. Part of the mathemat...
In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolut...
In the past decade, neural networks have demonstrated impressive performance in supervised learning....
The classical statistical learning theory implies that fitting too many parameters leads to overfitt...
The deep learning optimization community has observed how the neural networks generalization ability...
Training deep neural networks is inherently subject to the predefined and fixed loss functions durin...
Recent works have shown that deep neural networks can be employed to solve partial differential equa...
This thesis characterizes the training process of deep neural networks. We are driven by two apparen...
Loss landscape is a useful tool to characterize and compare neural network models. The main challeng...