In this thesis, we study different theoretical aspects of deep learning, in particular optimization, robustness, and approximation. Optimization: We study the optimization landscape of the empirical risk of deep linear neural networks with the square loss. It is known that, under weak assumptions, there are no spurious local minima and no local maxima. However, the existence and diversity of non-strict saddle points, which can play a role in the dynamics of first-order algorithms, have only been lightly studied. We go a step further with a full second-order analysis of the optimization landscape. We characterize, among all critical points, which are global minimizers, strict saddle points, and non-strict saddle points. We enumerate all the associat...
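For reference, here is a minimal formal sketch of the setting this abstract describes; the notation (depth H, data matrix X, target matrix Y) is our own assumption, not taken from the thesis itself:

```latex
% A depth-H linear network with weight matrices W_1,...,W_H, trained on data X
% with targets Y under the square loss (notation assumed, not from the thesis):
L(W_1,\dots,W_H) = \tfrac{1}{2}\,\bigl\lVert W_H W_{H-1} \cdots W_1 X - Y \bigr\rVert_F^2 .
% At a critical point \theta, i.e. \nabla L(\theta) = 0, the second-order
% classification referred to above distinguishes:
%   strict saddle:      \lambda_{\min}\bigl(\nabla^2 L(\theta)\bigr) < 0,
%   non-strict saddle:  \lambda_{\min}\bigl(\nabla^2 L(\theta)\bigr) = 0
%                       while \theta is not a local minimizer.
```

Non-strict saddles matter for the first-order dynamics mentioned above because at such points the gradient vanishes and no negative-curvature direction is available at second order to certify escape.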
Recent theoretical works on over-parameterized neural nets have focused on two aspects: optimization...
The first part of this thesis aims at exploring deep kernel architectures for complex data. One of t...
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrizatio...
The current challenge of Deep Learning is no longer computational power, nor its scope of applica...
Imposing orthogonality on the layers of neural networks is known to facilitate...
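As a hedged illustration of what "imposing orthogonality on the layers" can mean in practice, here is a minimal sketch of the standard soft orthogonality regularizer; the function name and the training-loop wiring are our own assumptions, and the paper itself may rely on a different mechanism (e.g., a hard constraint or an orthogonal parameterization):

```python
import torch

def soft_orthogonality_penalty(W: torch.Tensor) -> torch.Tensor:
    """Penalty ||W^T W - I||_F^2 pushing the columns of W toward orthonormality.

    One common way to impose (soft) orthogonality on a layer's weights;
    a sketch under our own naming assumptions, not the paper's method.
    """
    gram = W.t() @ W                                   # Gram matrix of columns
    eye = torch.eye(gram.shape[0], device=W.device, dtype=W.dtype)
    return ((gram - eye) ** 2).sum()                   # squared Frobenius norm

# Typical usage: add the penalty to the task loss for each weight matrix,
# scaled by a hyperparameter lam (both names are ours, for illustration).
# loss = task_loss + lam * sum(soft_orthogonality_penalty(W) for W in weights)
```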
Artificial neural networks are at the core of recent advances in Artificial Intelligence. One of the...
The increased availability of large amounts of data, from images in social networks, speech waveform...
The general features of the optimization problem for the case of overparametrized nonlinear networks...
This thesis develops and studies some principled methods for Deep Learning (DL) and deep Reinforceme...
We propose an optimal architecture for deep neural networks of given size. The optimal architecture ...
Recently, deep Convolutional Neural Networks (CNNs) have proven to be successful when employed in ar...
This work seeks to answer the question: as the (near-)orthogonality of weights is found to be a fa...
Since 2006, deep learning algorithms which rely on deep architectures with several layers of increas...
Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been broadly used in ...