Given a pair of models with similar training set performance, it is natural to assume that the model that possesses simpler internal representations would exhibit better generalization. In this work, we provide empirical evidence for this intuition through an analysis of the intrinsic dimension (ID) of model activations, which can be thought of as the minimal number of factors of variation in the model's representation of the data. First, we show that common regularization techniques uniformly decrease the last-layer ID (LLID) of validation set activations for image classification models and demonstrate how this strongly affects generalization performance. We also investigate how excessive regularization decreases a model's ability to extract features...
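The LLID referred to above is typically computed by applying a nearest-neighbor estimator to the matrix of last-layer activations collected over a dataset. As a concrete illustration, the following is a minimal sketch of the TwoNN estimator (Facco et al., 2017), a standard choice for measuring the ID of network activations; whether it matches the exact estimator used in this work is an assumption, and the function name twonn_id and its defaults are illustrative, not the paper's actual tooling.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def twonn_id(activations: np.ndarray, discard_fraction: float = 0.1) -> float:
    """Estimate intrinsic dimension of an (n_samples, n_features) activation matrix."""
    n = activations.shape[0]
    # Distances to the two nearest neighbors of each point (column 0 is the point itself).
    nn = NearestNeighbors(n_neighbors=3).fit(activations)
    dists, _ = nn.kneighbors(activations)
    r1, r2 = dists[:, 1], dists[:, 2]
    mu = r2 / r1
    # Discard the largest ratios, which are dominated by boundary effects.
    mu = np.sort(mu)[: int(n * (1.0 - discard_fraction))]
    # Under the TwoNN model, P(mu <= x) = 1 - x^(-d); fit d with a
    # through-the-origin regression of -log(1 - F_emp) on log(mu).
    f_emp = np.arange(1, mu.shape[0] + 1) / n
    x = np.log(mu)
    y = -np.log(1.0 - f_emp)
    return float(np.sum(x * y) / np.sum(x * x))

Applied to the last-layer activations of a classifier on held-out validation images, this yields a single scalar ID that can be tracked across regularization strengths, which is the kind of measurement the analysis above relies on.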