Dropout training, originally designed for deep neural networks, has been successful on high-dimensional single-layer natural language tasks. This paper proposes a theoretical explanation for this phenomenon: we show that, under a generative Poisson topic model with long documents, dropout training improves the exponent in the generalization bound for empirical risk minimization. Dropout achieves this gain much like a marathon runner who practices at altitude: once a classifier learns to perform reasonably well on training examples that have been artificially corrupted by dropout, it will do very well on the uncorrupted test set. We also show that, under similar conditions, dropout preserves the Bayes decision boundary and should therefore...
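As an illustration of the corrupted-training idea described in this abstract, here is a minimal sketch (not taken from the paper; the helper name dropout_corrupt, the toy Poisson count features, and all hyperparameters are assumptions for illustration). It trains a logistic classifier on freshly dropout-corrupted copies of bag-of-words-style features each epoch and then evaluates on the uncorrupted features, mirroring the "practice at altitude, race at sea level" analogy.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_corrupt(X, delta=0.5, rng=rng):
    """Independently zero out each feature with probability delta.

    Rescaling surviving features by 1/(1 - delta) keeps the corrupted
    features unbiased, i.e. E[corrupted x] = x.
    """
    mask = rng.random(X.shape) > delta
    return X * mask / (1.0 - delta)

# Toy data loosely in the spirit of a Poisson topic model:
# sparse non-negative word counts with a linearly separable label.
n, d = 200, 50
X = rng.poisson(0.3, size=(n, d)).astype(float)
w_true = rng.normal(size=d)
y = np.where(X @ w_true > 0, 1.0, -1.0)

# Logistic regression trained by plain gradient descent on dropout-corrupted
# copies of the training data (a fresh corruption is drawn every epoch).
w = np.zeros(d)
lr = 0.1
for epoch in range(200):
    Xc = dropout_corrupt(X)
    margins = y * (Xc @ w)
    sig = 1.0 / (1.0 + np.exp(-margins))
    grad = -(Xc * (y * (1.0 - sig))[:, None]).mean(axis=0)
    w -= lr * grad

# Evaluate the learned classifier on the *uncorrupted* features.
acc = ((X @ w) * y > 0).mean()
print(f"accuracy on uncorrupted features: {acc:.2f}")
```

Because the corruption is unbiased (surviving features are rescaled by 1/(1 - delta)), the classifier trained on corrupted examples can be applied directly to clean documents at test time.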
The undeniable computational power of artificial neural networks has granted the scientific communit...
We show that a neural network with arbitrary depth and non-linearities, with dropout applied before ...
Dropout and other feature noising schemes control overfitting by artificially corrupting the traini...
Deep neural nets with a large number of parameters are very powerful machine learning systems. Howev...
Recently it has been shown that when training neural networks on a limited amount of data, randomly ...
It is important to understand how the popular regularization method dropout helps the neural network...
Dropout is one of the most popular regularization methods used in deep learning. The general form of...
Recently, training with adversarial examples, which are generated by adding a small but worst-case p...
Dropout is a recently introduced algorithm for training neural networks by randomly dropping units du...
Regularization is essential when training large neural networks. As deep neural networks can be math...
In recent years, deep neural networks have become the state of the art in many machine learning doma...
We investigate the convergence and convergence rate of stochastic training algorithms for Neural Net...
Dropout has seen great success in training deep neural networks by independently zero...