While classic studies proved that wide networks allow universal approximation, recent research and the successes of deep learning demonstrate the power of deep networks. Motivated by this symmetry, we investigate whether the design of artificial neural networks should have a directional preference, and what the mechanism of interaction is between the width and depth of a network. Inspired by De Morgan's law, we address this fundamental question by establishing a quasi-equivalence between the width and depth of ReLU networks in two aspects. First, we formulate two transforms for mapping an arbitrary ReLU network to a wide network and a deep network, respectively, for either regression or classification, so that essentially the same capability ...
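The paper's transforms are not spelled out in this snippet, so as a hedged, self-contained illustration of the width-depth trade-off it refers to, the sketch below uses the classic ReLU sawtooth construction (a standard depth-separation example, not the paper's method): composing a two-unit "tent" network k times reproduces a sawtooth that a one-hidden-layer network needs 2^k ReLU units to match. All function names here are hypothetical.

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)

    def tent(x):
        # One hidden layer with 2 ReLU units: equals 2x on [0, 1/2] and 2 - 2x on [1/2, 1].
        return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

    def deep_sawtooth(x, k):
        # Depth-k composition of the tent: a sawtooth with 2^k linear pieces,
        # built from only 2k ReLU units in total.
        for _ in range(k):
            x = tent(x)
        return x

    def wide_sawtooth(x, k):
        # The same sawtooth as a single hidden layer: one ReLU unit per
        # breakpoint, i.e. 2^k units, with slope changes of +/- 2^(k+1).
        y = (2.0 ** k) * relu(x)
        for i in range(1, 2 ** k):
            y += (-1) ** i * (2.0 ** (k + 1)) * relu(x - i / 2 ** k)
        return y

    xs = np.linspace(0.0, 1.0, 1001)
    for k in (1, 2, 3, 4):
        assert np.allclose(deep_sawtooth(xs, k), wide_sawtooth(xs, k), atol=1e-9)
    print("depth-k composition matches the width-2^k single-layer network")

The two constructions are equal as piecewise-linear functions; the assert checks this numerically on a grid, making concrete the exponential gap between the O(k) units the deep network uses and the 2^k units the shallow one requires.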
The strong lottery ticket hypothesis has highlighted the potential for training deep neural networks...
We propose an optimal architecture for deep neural networks of a given size. The optimal architecture ...
The paper briefly reviews several recent results on hierarchical architectures for learning from exa...
We solve an open question from Lu et al. (2017) by showing that any target network with inputs in $...
Deep learning networks with convolution, pooling and subsampling are a special case of hierarchica...
The paper reviews and extends an emerging body of theoretical results on deep learning including the...
Deep learning has...
Recently there has been much interest in understanding why deep neural networks are preferred to sha...
[formerly titled "Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionali...
We contribute to a better understanding of the class of functions that can be represented by a neura...
Across scientific and engineering disciplines, the algorithmic pipeline for processing and understand...
This paper studies the expressive power of graph neural networks falling within the message-passing ...
In practice, multi-task learning (through learning features shared among tasks) is an essential prop...
People believe that depth plays an important role in the success of deep neural networks (DNNs). However,...
A remarkable characteristic of overparameterized deep neural networks (DNNs) is that their accuracy ...