The advancement of neural network models has led to state-of-the-art performance on a wide range of NLP tasks, e.g., machine translation and question answering. Despite these remarkable performance gains, NLP systems deployed in the wild are brittle and fragile, as they routinely experience a shift in the data distribution: the training data distribution differs from the one at test time. For example, a multilingual machine translation model is expected to perform uniformly well across a set of language pairs, while training resources can be extremely imbalanced across those pairs. Such distribution shift is also ubiquitous in the modern pretrain-then-fine-tune paradigm, where models pre-trained on a large text corpus are fine-...
In recent years, the field of language modelling has witnessed exciting developments. Especially, th...
Deep learning has achieved state-of-the-art performance on a wide range of tasks, including natural ...
NLP technologies are uneven for the world's languages as the state-of-the-art models are only availa...
This thesis aims for general robust Neural Machine Translation (NMT) that is agnostic to the test do...
With the advent of deep learning, research in many areas of machine learning is converging towards t...
The current generation of neural network-based natural language processing models excels at learning...
Scarcity of parallel sentence-pairs poses a significant hurdle for training high-quality Neural Mach...
The field of natural language processing (NLP) has recently undergone a paradigm shift. Since the in...
We analyze the learning dynamics of neural language and translation models using Loss Change Allocat...
This paper presents two techniques for language model (LM) adaptation. The first aims to build a mor...
Substantial progress has been made in the field of natural language processing (NLP) due to the adve...
This paper introduces a new data augmentation method for neural machine translation that can enforce...
With the recent success of deep learning methods, neural-based models have achieved superior perform...
Pre-training language models (LMs) on large-scale unlabeled text data makes the model much easier to...
The performance decay experienced by deep neural networks (DNNs) when confronted with distributional...