The paper briefly reviews several recent results on hierarchical architectures for learning from examples, which may formally explain the conditions under which deep convolutional neural networks perform much better in function approximation problems than shallow, one-hidden-layer architectures. The paper announces new results for the ReLU function, a non-smooth activation used in present-day neural networks, as well as for Gaussian networks. We propose a new definition of relative dimension to encapsulate the different notions of sparsity of a function class that can be exploited by deep networks, but not by shallow ones, to drastically reduce the complexity required for approximation and learning.
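To make the compositional-sparsity claim concrete, the following is a minimal Python sketch, not taken from the paper: the constituent function h and the helper names are illustrative assumptions, while the parameter-count exponents follow the bounds reviewed in this line of work (shallow networks need on the order of eps^(-d/m) units to reach accuracy eps for a d-variate function of smoothness m, whereas a deep network matching a binary-tree composition of bivariate constituents needs on the order of (d-1) * eps^(-2/m) units).

```python
import numpy as np

# Illustrative compositional target on [0,1]^8, assembled as a binary tree
# of bivariate constituent functions:
#   f(x) = h(h(h(x1,x2), h(x3,x4)), h(h(x5,x6), h(x7,x8)))

def h(a, b):
    # A generic smooth bivariate constituent (illustrative choice).
    return np.tanh(a * b + 0.5 * (a - b))

def compositional_target(x):
    """Evaluate the binary-tree composition level by level."""
    vals = list(x)
    while len(vals) > 1:
        vals = [h(vals[i], vals[i + 1]) for i in range(0, len(vals), 2)]
    return vals[0]

# Order-of-magnitude unit counts to reach accuracy eps, per the reviewed
# bounds (m = smoothness of the target / constituents; hypothetical helpers).
def shallow_units(eps, d, m=1):
    return eps ** (-d / m)

def deep_units(eps, d, m=1):
    return (d - 1) * eps ** (-2 / m)

if __name__ == "__main__":
    x = np.random.rand(8)
    print("f(x) =", compositional_target(x))
    for eps in (0.1, 0.01):
        print(f"eps={eps}: shallow ~{shallow_units(eps, 8):.1e} units, "
              f"deep ~{deep_units(eps, 8):.1e} units")
```

For d = 8, m = 1, and eps = 0.01 the contrast is 1e16 units for the shallow net against roughly 7e4 for the deep one: the deep architecture escapes the curse of dimensionality precisely because its graph mirrors the compositional structure of the target.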