We propose a metric for evaluating the generalization ability of deep neural networks trained with mini-batch gradient descent. Our metric, called gradient disparity, is the l2 norm distance between the gradient vectors computed on two mini-batches drawn from the training set. It is derived from a probabilistic upper bound on the difference between the classification errors over a given mini-batch when the network is trained on this mini-batch and when it is trained on another mini-batch of points sampled from the same dataset. We empirically show that gradient disparity is a very promising early-stopping criterion (i) when data is limited, as it uses all the samples for training, and (ii) when the available data has noisy labels, as it signals...
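The abstract defines gradient disparity concretely as the l2 distance between the gradient vectors of two training mini-batches. Below is a minimal sketch of how that quantity could be computed, assuming a PyTorch classification model and a cross-entropy loss; the function names and the batch format are illustrative assumptions, not taken from the paper's code.

```python
import torch
import torch.nn.functional as F


def flat_grad(model, inputs, targets):
    """Gradient of the mini-batch cross-entropy loss w.r.t. all trainable
    parameters, flattened into a single vector (assumed loss; illustrative)."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = F.cross_entropy(model(inputs), targets)
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])


def gradient_disparity(model, batch1, batch2):
    """l2 norm distance between the gradient vectors of two mini-batches,
    each given as an (inputs, targets) tuple."""
    g1 = flat_grad(model, *batch1)
    g2 = flat_grad(model, *batch2)
    return torch.norm(g1 - g2, p=2).item()
```

One plausible way to use this as the early-stopping criterion the abstract describes is to average `gradient_disparity` over a few randomly drawn pairs of mini-batches at the end of each epoch and stop training once the averaged value starts to rise.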
We empirically show that the test error of deep networks can be estimated by simply training the sam...
Large-batch SGD is important for scaling training of deep neural networks. However, without fine-tun...
During minibatch gradient-based optimization, the contribution of observations to the updating of th...
Recent advancements in the field of deep learning have dramatically improved the performance of mach...
The generalization mystery in deep learning is the following: Why do over-parameterized neural netwo...
Mini-batch stochastic gradient methods (SGD) are state of the art for distributed training of deep n...
In this work, we propose to progressively increase the training difficulty during learning a neural ...
How to train deep neural networks (DNNs) to generalize well is a central concern in deep learning, e...
Empirical studies show that gradient-based methods can learn deep neural networks (DNNs) with very g...
This paper shows that if a large neural network is used for a pattern classification problem, and th...
Sample complexity results from computational learning theory, when applied to neural network learnin...
This paper discovers that the neural network with lower decision boundary (DB) variability has bette...
As deep learning has become a solution for various machine learning and artificial intelligence applicati...
Deep learning networks are typically trained by Stochastic Gradient Descent (SGD) methods that itera...