Large sequence-to-sequence models for tasks such as Neural Machine Translation (NMT) are usually trained on hundreds of millions of samples. However, training is only the beginning of a model's life-cycle: real-world deployments require further behavioral adaptation as new requirements emerge or shortcomings become known. Typically, requests to delete a model behavior are addressed by retraining, whereas requests to add a behavior are addressed by fine-tuning; both procedures are instances of data-based model intervention. In this work, we present a preliminary study investigating rank-one editing as a direct intervention method for behavior deletion requests in encoder-decoder transformers...
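To make the mechanism concrete, below is a minimal NumPy sketch of a generic rank-one update to a single weight matrix. It is illustrative only, not the paper's exact editing procedure; the function name `rank_one_edit` and the key/value vectors `k` and `v` are assumptions for the example.

```python
import numpy as np

def rank_one_edit(W: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Return an edited weight matrix W' such that W' @ k == v.

    W: (d_out, d_in) weight matrix of, e.g., a feed-forward layer.
    k: (d_in,)  key vector, e.g. the hidden representation associated
       with the behavior to be deleted.
    v: (d_out,) desired output for that key.

    The update W' = W + (v - W k) k^T / (k^T k) differs from W by a
    rank-one matrix, so outputs for inputs orthogonal to k are unchanged.
    """
    residual = v - W @ k  # what the current weights get wrong for this key
    return W + np.outer(residual, k) / (k @ k)

# Toy usage: edit a random layer so that `key` maps to `target`.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
key, target = rng.normal(size=4), rng.normal(size=8)
W_edited = rank_one_edit(W, key, target)
assert np.allclose(W_edited @ key, target)
```

Because the change is rank-one, the edit is a direct, weight-space intervention: it requires no additional training data and leaves the rest of the model's parameters untouched.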
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tas...
Sparsity has become one of the promising methods to compress and accelerate Deep Neural Networks (DN...
Language models learn a great quantity of factual information during pretraining, and recent work lo...
Improving machine translation (MT) by learning from human post-edits is a powerful solution that is ...
While large pre-trained models have enabled impressive results on a variety of downstream tasks, the...
With the advent of deep learning, research in many areas of machine learning is converging towards t...
We call into question the recently popularized method of direct model editing as a means of correcti...
Large pre-trained generative models are known to occasionally output undesirable samples, which unde...
Previous phrase-based approaches to Automatic Post-editing (APE) have shown that the dependency ...
Attention-based autoregressive models have achieved state-of-the-art performance in various sequence...
In a translation workflow, machine translation (MT) is almost always followed by a human post-editin...
Even the largest neural networks make errors, and once-correct predictions can become invalid as the...
This paper considers continual learning of large-scale pretrained neural machine translation model w...
Differently from the traditional statistical MT that decomposes the translation task into distinct s...
Automatic post-editing (APE) aims to reduce manual post-editing efforts by automatically correcting ...