Determining the memory capacity of two-layer neural networks with $m$ hidden neurons and input dimension $d$ (i.e., $md+m$ total trainable parameters), meaning the largest number of general data points the network can memorize, is a fundamental machine learning question. For polynomial activations of sufficiently high degree, such as $x^k$ with $\binom{d+k}{d-1}\ge n$, and for real analytic activations such as sigmoids and smoothed rectified linear units (smoothed ReLUs), we establish a lower bound of $\lfloor md/2\rfloor$ and show it is optimal up to a factor of approximately 2. Analogous prior results were limited to Heaviside and ReLU activations. To analyze general real analytic activations, we derive the precise generic rank of the networ...
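As a concrete illustration of the quantities in this abstract, the following minimal Python sketch (function names are ours and purely illustrative) computes the $\lfloor md/2\rfloor$ lower bound for a given network size and finds the smallest degree $k$ at which the quoted condition $\binom{d+k}{d-1}\ge n$ holds.

```python
from math import comb

def capacity_lower_bound(m: int, d: int) -> int:
    """Lower bound floor(m*d/2) on the number of memorizable points,
    as stated in the abstract above."""
    return (m * d) // 2

def min_degree_for_memorization(d: int, n: int) -> int:
    """Smallest k with C(d+k, d-1) >= n, i.e. the degree at which the
    monomial activation x^k is 'sufficiently high degree' for n points
    in input dimension d (condition quoted from the abstract)."""
    k = 1
    while comb(d + k, d - 1) < n:
        k += 1
    return k

# Example configuration (illustrative): d = 10 inputs, m = 100 hidden neurons.
m, d = 100, 10
n = capacity_lower_bound(m, d)           # floor(100*10/2) = 500 points
k = min_degree_for_memorization(d, n)    # smallest qualifying degree (k = 3)
print(f"lower bound: {n} points; x^{k} satisfies C({d}+{k}, {d}-1) >= {n}")
```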
We present a model of long-term memory: learning within irreversible bounds. The best bound values ...
Wolfgang Maass, Institute for Theoretical Computer Science, Technische Universitaet Graz, Klosterwie...
The neural network is a powerful computing framework that has been exploited by biological evolution...
This paper shows that neural networks which use continuous activation functions have VC dime...
We study the excess capacity of deep networks in the context of supervised classification. That is, ...
A long-standing open problem in the theory of neural networks is the development of quantitative met...
Overwhelming theoretical and empirical evidence shows that mildly overparametrized neural networks -...
We propose to measure the memory capacity of a state machine by the number of discernible states, w...
Threshold-linear (graded response) units approximate the real firing behaviour of pyramidal neurons ...
Differentiable neural computers extend artificial neural networks with an explicit memory without in...
We consider the algorithmic problem of finding the optimal weights and biases for a two-layer fully ...
We contribute to a better understanding of the class of functions that can be represented by a neura...
We study finite sample...
In this article we present new results on neural networks with linear threshold activation functions...
How does the size of a neural circuit influence its learning performance? Larger brains tend to be f...