The volume of the distribution of weight sets associated with a loss value may be the source of implicit regularization from overparameterization, owing to the phenomenon, demonstrated by hyperspheres, of volume contracting with increasing dimension for geometric figures. We introduce the geometric regularization conjecture and extend it to an explanation for the double descent phenomenon by considering a similar property resulting from the shrinking intrinsic dimensionality of the distribution of potential weight set updates available along the training path: if that distribution retracts across a volume versus dimensionality curve peak when approaching the global minimum, we could expect geometric regularization to re-emerge. We illustrate how data...
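As a quick illustration of the volume contraction the conjecture leans on, the following sketch (ours, not from the paper) computes the volume of the unit n-ball, V_n = pi^(n/2) / Gamma(n/2 + 1), and shows that it peaks near n = 5 and then shrinks toward zero as the dimension grows:

```python
import math

def unit_ball_volume(n: int) -> float:
    """Volume of the unit ball in R^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

# Volume rises up to about n = 5, then decays rapidly toward zero.
for n in (1, 2, 5, 10, 20, 50, 100):
    print(f"n = {n:3d}  V_n = {unit_ball_volume(n):.3e}")
# e.g. V_5 ~ 5.264, V_20 ~ 2.58e-02, V_100 ~ 2.37e-40, so a fixed-radius
# figure occupies vanishing volume as dimensionality increases.
```

The same qualitative curve, a peak followed by contraction, is what the abstract invokes for the distribution of weight set updates near the global minimum.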
Neural networks are more expressive when they have multiple layers. In turn, conventional training m...
Thesis (Ph.D.), Boston University. High dimensional inference is motivated by many real-life problems ...
Finding the optimal size of deep learning models is a timely problem of broad impact, especially in e...
The risk of overparameterized models, in particular deep neural networks, is often double-descent sh...
Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high...
Injecting noise within gradient descent has several desirable features. In this paper, we explore no...
In many contexts, simpler models are preferable to more complex models and the control of this model...
Abstract — We consider the problem of nonlinear dimensionality reduction: given a training set of hi...
Manifold regularization is an approach which exploits the geometry of the marginal distribution. The...
A) A two-dimensional example illustrates how a two-class classification between the two data sets ...
Large neural networks have proved remarkably effective in modern deep learning practice, even in the...
Deep networks are typically trained with many more parameters than the size of the training dataset....
Given a cloud of $n$ data points in $\mathbb{R}^d$, consider all projections onto $m$-dimensional su...
In this thesis, we study the overfitting problem in supervised learning of classifiers from a geomet...
Several regularization methods have recently been introduced which force the latent activations of a...