Finding neural network weights that generalize well from small datasets is difficult. A promising approach is to learn a weight initialization such that a small number of weight changes results in low generalization error. We show that this form of meta-learning can be improved by letting the learning algorithm decide which weights to change, i.e., by learning where to learn. We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis. This selective sparsity results in better generalization and less interference in a range of few-shot and continual learning problems. Moreover, we find that sparse learning also emerges in a more expressive model where learning rates are meta-learned.
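To make the idea concrete, below is a minimal sketch (not the paper's implementation) of a MAML-style setup in which per-parameter inner-loop learning rates are meta-learned alongside the initialization. Passing the rates through a ReLU lets individual rates reach exactly zero, so some weights opt out of adaptation and sparsity can emerge. The toy regression task and all names (theta, alpha, loss_fn) are illustrative assumptions.

```python
# Sketch: meta-learning an initialization plus per-weight inner-loop
# learning rates. Weights whose rate hits zero are frozen during adaptation.
import torch

torch.manual_seed(0)

theta = torch.randn(5, requires_grad=True)          # meta-learned initialization
alpha = torch.full((5,), 0.1, requires_grad=True)   # meta-learned per-weight rates

meta_opt = torch.optim.Adam([theta, alpha], lr=1e-2)

def loss_fn(params, x, y):
    # Simple linear-regression loss on a toy task.
    return ((x @ params - y) ** 2).mean()

for step in range(1000):
    meta_opt.zero_grad()
    # Sample a toy task whose ground-truth weights are themselves sparse.
    w_true = torch.randn(5) * (torch.rand(5) < 0.4)
    x_tr, x_te = torch.randn(10, 5), torch.randn(10, 5)
    y_tr, y_te = x_tr @ w_true, x_te @ w_true

    # Inner loop: one gradient step, scaled per weight by non-negative rates.
    grads = torch.autograd.grad(loss_fn(theta, x_tr, y_tr), theta,
                                create_graph=True)[0]
    theta_adapted = theta - torch.relu(alpha) * grads

    # Outer loop: backpropagate the post-adaptation test loss through the
    # inner update into both the initialization and the rates.
    loss_fn(theta_adapted, x_te, y_te).backward()
    meta_opt.step()

# Rates driven to (or below) zero correspond to weights excluded from learning.
print("fraction of weights frozen:", (alpha <= 0).float().mean().item())
```

In this sketch, sparsity locks in naturally: once a rate is pushed to zero, the ReLU blocks its gradient, so that weight stays frozen during adaptation.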