Can we construct a neural language model which is inductively biased towards learning human language? Motivated by this question, we aim at constructing an informative prior for held-out languages on the task of character-level, open-vocabulary language modeling. We obtain this prior as the posterior over network weights conditioned on the data from a sample of training languages, which is approximated through Laplace’s method. Based on a large and diverse sample of languages, the use of our prior outperforms baseline models with an uninformative prior in both zero-shot and few-shot settings, showing that the prior is imbued with universal linguistic knowledge. Moreover, we harness broad language-specific information available for most lang...
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but...
Neural networks drive the success of natural language processing. A fundamental property of language...
How cross-linguistically applicable are NLP models, specifically language models? A fair comparison ...
Recurrent neural networks (RNNs) are exceptionally good models of distributions over natural languag...
NLP technologies are uneven for the world's languages as the state-of-the-art models are only availa...
Language modeling has been widely used in the application of natural language processing, and there...
Most combinations of NLP tasks and language varieties lack in-domain examples for supervised trainin...
Since language models are used to model a wide variety of languages, it is natural to ask whether th...
Pretraining deep neural networks to perform language modeling - that is, to reconstruct missing word...
Why do artificial neural networks model language so well? We claim that in order to answer this ques...
Currently, N-gram models are the most common and widely used models for statistical language modelin...
We examine the inductive inference of a complex grammar - specifically, we consider the task of trai...
Language is central to human intelligence. We review recent breakthroughs in machine language proc...
Scaling existing applications and solutions to multiple human languages has traditionally proven to ...
In recent years, the field of language modelling has witnessed exciting developments. Especially, th...