Large pre-trained, zero-shot capable models have shown considerable success both for standard transfer and adaptation tasks, and exhibit particular robustness to distribution shifts. In addition, subsequent fine-tuning can considerably improve performance on a selected downstream task. However, through naive fine-tuning, these zero-shot models lose their generalizability and robustness to distribution shifts. This is a particular problem for tasks such as Continual Learning (CL), where continuous adaptation has to be performed as new task distributions are introduced sequentially. In this work, we showcase that where fine-tuning falls short in adapting such zero-shot capable models, simple momentum-based weight interpolation can provide con...
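The interpolation mechanism referenced in this abstract can be summarized in a few lines. Below is a minimal sketch, not the authors' exact implementation: it assumes PyTorch-style models, a hypothetical momentum coefficient tau, and a hypothetical fine_tune_step helper. A slowly moving copy of the weights, initialized from the zero-shot model, is updated as an exponential moving average of the rapidly fine-tuned weights, so the evaluated model stays closer to the robust zero-shot solution while still absorbing the new task.

```python
# Minimal sketch of momentum-based weight interpolation (illustrative only).
# `tau`, the update frequency, and `fine_tune_step` are assumptions, not the paper's exact setup.
import copy
import torch


def interpolate_weights(slow_model, fast_model, tau=0.99):
    """Exponential moving average of parameters: slow <- tau * slow + (1 - tau) * fast."""
    with torch.no_grad():
        for p_slow, p_fast in zip(slow_model.parameters(), fast_model.parameters()):
            p_slow.mul_(tau).add_(p_fast, alpha=1.0 - tau)


# Usage sketch:
# fast_model = zero-shot model being naively fine-tuned on the incoming task
# slow_model = interpolated copy, initialized from the zero-shot weights
#
# slow_model = copy.deepcopy(fast_model)
# for batch in task_stream:
#     fine_tune_step(fast_model, batch)                # hypothetical training step
#     interpolate_weights(slow_model, fast_model)      # momentum update after each step
#
# Evaluation uses slow_model, which retains more of the zero-shot robustness.
```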
Continual learning is a framework of learning in which we aim to move beyond the limitations of stan...
The Contrastive Language-Image Pre-training (CLIP) Model is a recently proposed large-scale pre-trai...
Continual learning necessitates the continual adaptation of models to newly emerging tasks while min...
Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data dist...
Deep neural networks have shown remarkable performance when trained on independent and identically d...
Recently, continual learning (CL) has gained significant interest because it enables deep learning m...
Continual learning entails learning a sequence of tasks and balancing their knowledge appropriately....
Deep learning has enjoyed tremendous success over the last decade, but the training of practically u...
This paper argues that continual learning methods can benefit by splitting the capacity of the learn...
Approaches to continual learning aim to successfully learn a set of related tasks that arrive in an ...
Work on continual learning (CL) has largely focused on the problems arising from the dynamically-cha...
In continual learning (CL), the goal is to design models that can learn a sequence of tasks without ...
This paper considers continual learning of large-scale pretrained neural machine translation model w...
We study a practical setting of continual learning: fine-tuning on a pre-trained model continually. ...
Online continual learning aims to get closer to a live learning experience by learning directly on a...