We consider the idealized setting of gradient flow on the population risk for infinitely wide two-layer ReLU neural networks (without bias), and study the effect of symmetries on the learned parameters and predictors. We first describe a general class of symmetries which, when satisfied by the target function $f^*$ and the input distribution, are preserved by the dynamics. We then study more specific cases. When $f^*$ is odd, we show that the dynamics of the predictor reduces to that of a (non-linearly parameterized) linear predictor, and its exponential convergence can be guaranteed. When $f^*$ has a low-dimensional structure, we prove that the gradient flow PDE reduces to a lower-dimensional PDE. Furthermore, we present informal and numer...
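To see why oddness linearizes the predictor, a minimal sketch (assuming a sign-symmetric initialization in which neurons come in pairs $(w_j, a_j)$ and $(-w_j, -a_j)$, a pairing that the gradient flow plausibly preserves when $f^*$ is odd): the ReLU $\sigma(z) = \max(z, 0)$ satisfies $\sigma(z) - \sigma(-z) = z$, so each pair contributes
$$
a_j\,\sigma(w_j^\top x) - a_j\,\sigma(-w_j^\top x) = a_j\, w_j^\top x,
$$
and the predictor $f(x) = \sum_j a_j\,\sigma(w_j^\top x)$ collapses to the linear map $x \mapsto \big(\sum_j a_j w_j\big)^\top x$, which is still non-linearly parameterized by the weights $(w_j, a_j)$.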
Despite a great deal of research, it is still unclear why neural networks are so susceptible to adve...
Neural networks trained via gradient descent with random initialization and without any regularizati...
We contribute to a better understanding of the class of functions that is represented by a neural ne...
Rectified linear units (ReLUs) have become the main model for the neural units in current deep learn...
Under mild assumptions, we investigate the structure of the loss landscape of two-layer neural networks ...
Understanding the implicit bias of gradient descent for the generalization capability of ReLU networks has b...
We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks wit...
We introduce exact macroscopic on-line learning dynamics of two-layer neural networks with ReLU unit...
Recently, several studies have proven the global convergence and generalization abilities of the gra...
By applying concepts from the statistical physics of learning, we study layered neural networks of r...
We explicitly analyze the trajectories of learning near singularities in hierarchical networks, suc...
Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achiev...
Deep neural networks achieve stellar generalisation on a variety of problems, despite often being la...
Substantial work indicates that the dynamics of neural networks (NNs) is closely related to their in...
We propose structure-preserving neural-network-based numerical schemes to solve both $L^2$-gradient ...