A new loss function is proposed that, in effect, minimizes the hinge loss an infinite number of times, pushing the margins $y_i f(x_i) \to \infty$. It is proven that, for a linear model on linearly separable data, gradient descent on this modified hinge loss converges in direction to the $\ell_2$ max-margin separator at a rate of $\mathcal{O}\!\left(\sqrt{d/t}\right)$, where $d$ is the dimension of the data and $t$ is the iteration count. Next, an explicit formula is derived for the dynamical system underlying the gradient descent iterates of two-layer linear networks trained on the inner-product loss. Using this dynamical system, an explicit algorithm is developed that, when implemented, exactly reproduces the gradient descent iterates of two-layer ReLU networks on the inner-product loss. This ...
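As a rough illustration of the first claim, the sketch below runs gradient descent with a linear model on separable toy data and tracks how the normalized iterate aligns with a fixed reference direction. The logistic loss is used here purely as a stand-in for the paper's modified hinge loss (it likewise drives $y_i f(x_i) \to \infty$ on separable data); the toy data, learning rate, and the use of the ground-truth direction `w_star` as a proxy for the max-margin separator are assumptions of this sketch, not details from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linearly separable toy data labelled by a ground-truth direction
# (an assumption of this sketch; w_star is only a proxy for the
#  exact l2 max-margin direction).
d, n = 5, 200
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)

def grad_logistic(w, X, y):
    """Gradient of the mean logistic loss log(1 + exp(-y_i <w, x_i>))."""
    margins = y * (X @ w)
    # sigmoid(-m) computed stably as exp(-logaddexp(0, m))
    coef = -y * np.exp(-np.logaddexp(0.0, margins))
    return (coef[:, None] * X).mean(axis=0)

w = np.zeros(d)
lr = 1.0
for t in range(1, 50001):
    w -= lr * grad_logistic(w, X, y)
    if t % 10000 == 0:
        direction = w / np.linalg.norm(w)
        # The norm keeps growing while the direction stabilizes,
        # which is the "convergence in direction" being illustrated.
        print(f"t={t:6d}  ||w||={np.linalg.norm(w):7.3f}  alignment={direction @ w_star:+.4f}")
```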
Normalized gradient descent has shown substantial success in speeding up the convergence of exponen...
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrizatio...
Given the training examples $\{x_i, y_i\}$, the squared hinge loss is written as: $J = \sum_{i=1}^{n} \max(0,\, 1 - y_i f(x_i))^2$ ...
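As a quick numerical sanity check of the formula above, the snippet below evaluates the squared hinge loss for a linear predictor $f(x) = \langle w, x \rangle$; the linear form of $f$ and the toy values of `X`, `y`, and `w` are assumptions made for illustration.

```python
import numpy as np

def squared_hinge_loss(w, X, y):
    """J = sum_i max(0, 1 - y_i * <w, x_i>)**2 for a linear predictor."""
    margins = y * (X @ w)
    return np.sum(np.maximum(0.0, 1.0 - margins) ** 2)

X = np.array([[1.0, 2.0], [-1.5, 0.5], [0.3, -2.0]])
y = np.array([1.0, -1.0, -1.0])
w = np.array([0.4, -0.1])
print(squared_hinge_loss(w, X, y))
```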
The deep learning optimization community has observed how the neural networks' generalization ability...
We study the optimization landscape of deep linear neural networks with the square loss. It is known...
Neural networks have been shown to perform incredibly well ...
Despite the fact that the loss functions of deep neural networks are highly non-convex, gradient-base...
In the past decade, neural networks have demonstrated impressive performance in supervised learning....
We study on-line generalized linear regression with multidimensional outputs, i.e., neural networks ...
Incorporating higher-order optimization functions, such as Levenberg-Marquardt (LM), has revealed be...
Under mild assumptions, we investigate the structure of the loss landscape of two-layer neural networks ...
Despite the success of Lipschitz regularization in stabilizing GAN training, the exact reason for its...
Policy optimization is a fundamental principle for designing reinforcement learning algorithms, and ...
In this paper, we propose a geometric framework to analyze the convergence properties of gradient de...