Natural language processing (NLP) technology has been applied in various domains, ranging from social media and digital humanities to public health. Unfortunately, existing NLP techniques often perform poorly when adopted in these areas. The language of new datasets and settings can differ significantly from standard NLP training corpora, and modern NLP techniques are usually vulnerable to variation in non-standard language, in terms of the lexicon, syntax, and semantics. Previous approaches to this problem suffer from three major weaknesses. First, they often employ supervised methods that require expensive annotations and easily become outdated given the dynamic nature of language. Second, the...
Addressing the cross-lingual variation of grammatical structures and meaning categorization is a key...
Native Language Identification (NLI) is the task of recognizing the native language of an author fro...
In this paper we present a Dutch and English dataset that can serve as a gold standard for evaluatin...
With the fast growth of the amount of digitalized texts in recent years, text information management...
Real world data differs radically from the benchmark corpora we use in natural language processing (...
The underlying traits of our demographic group affect and shape our thoughts, and therefore surface ...
Both Statistical Machine Translation and Neural Machine Translation (NMT) are data-dependent learnin...
Lexical normalization is the task of transforming an utterance into its standardized form. This task...
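As a toy illustration of the lexical normalization task defined above (not a method from any of these abstracts), a minimal dictionary-lookup normalizer can map non-standard tokens to their standardized forms; the lexicon entries here are hypothetical examples:

```python
# Minimal sketch of lexical normalization via dictionary lookup.
# The lexicon below is a hypothetical toy example, not a real resource;
# practical systems also handle unseen variants and context.
NORMALIZATION_LEXICON = {
    "u": "you",
    "r": "are",
    "gr8": "great",
    "2morrow": "tomorrow",
}

def normalize(utterance: str) -> str:
    """Replace each known non-standard token with its standardized form."""
    tokens = utterance.split()
    return " ".join(NORMALIZATION_LEXICON.get(t.lower(), t) for t in tokens)

print(normalize("u r gr8"))  # -> "you are great"
```

Real normalization systems go well beyond such a static lookup, since novel spellings and ambiguous tokens (e.g. "2" as a digit vs. "to") require context-sensitive models.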
NLP technologies are uneven for the world's languages as the state-of-the-art models are only availa...
This thesis aims for general robust Neural Machine Translation (NMT) that is agnostic to the test do...
Language variation and change are ubiquitous, and one aim of linguistic research is to understand sy...
The advancement of neural network models has led to state-of-the-art performance in a wide range of ...