Most combinations of NLP tasks and language varieties lack in-domain examples for supervised training because of the paucity of annotated data. How can neural models make sample-efficient generalizations from task–language combinations with available data to low-resource ones? In this work, we propose a Bayesian generative model for the space of neural parameters. We assume that this space can be factorized into latent variables for each language and each task. We infer the posteriors over such latent variables based on data from seen task–language combinations through variational inference. This enables zero-shot classification on unseen combinations at prediction time. For instance, given training data for named entity recognition (NER) i...
With the advent of deep neural networks in recent years, Neural Machine Translation (NMT) systems ha...
Neural Machine Translation (NMT) models are typically trained by considering humans as end-users and...
We investigate the problem of unsupervised part-of-speech tagging when raw parallel data is availabl...
NLP technologies are uneven for the world's languages as the state-of-the-art models are only availa...
Large language models have recently been shown to attain reasonable zero-shot ...
An important concern in training multilingual neural machine translation (NMT) is to translate betwe...
Can we construct a neural language model which is inductively biased towards learning human language...
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural langua...
Natural language is rich with layers of implicit structure, and previous research has shown that we ...
The quality of translations produced by statistical machine translation (SMT) systems crucially depe...
Recent research has shown promise in multilingual modeling, demonstrating how a single model is capa...
For resource-rich languages, recent works have shown Neural Network based Language Models (NNLMs)...
Multilingual Neural Machine Translation (MNMT) for low-resource languages (LRL) can be enhanced by ...
Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these ...
Multilingual neural machine translation has shown the capability of directly translating between lan...