In low-resource settings, data augmentation strategies are commonly used to improve performance. Numerous approaches have attempted document-level augmentation (e.g., for text classification), but few studies have explored token-level augmentation. Performed naively, data augmentation can produce semantically incongruent and ungrammatical examples. In this work, we compare simple masked language model replacement with an augmentation method based on constituency tree mutations for improving named entity recognition in low-resource settings, with the aim of preserving the linguistic cohesion of the augmented sentences.
Comment: Submitted to Pattern-based Approaches to NLP in the Age of Deep Learning 2022 (Pan-DL 2022)
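
To make the idea of token-level augmentation concrete, the following is a minimal sketch of masked replacement that keeps entity annotations aligned. It is not the implementation evaluated in this work. It assumes BIO-style tags in which "O" marks non-entity tokens, and the names `augment_tokens`, `FillFn`, and `constant_fill` are hypothetical. The `fill` argument stands in for a masked language model's prediction; here it is a trivial stub so that the example runs without any model.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical interface: given the masked token sequence and the masked
# position, return one replacement token (e.g. a language model's top guess).
FillFn = Callable[[Sequence[str], int], str]


def augment_tokens(
    tokens: Sequence[str],
    labels: Sequence[str],
    fill: FillFn,
    rate: float = 0.3,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], List[str]]:
    """Replace a random subset of non-entity tokens, one position at a time.

    Tokens whose label is not "O" are never touched, so the label sequence
    stays valid for the augmented sentence (assumes BIO-style tagging).
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    rng = rng or random.Random()
    out = list(tokens)
    for i, label in enumerate(labels):
        if label != "O":
            continue  # keep entity tokens so annotations remain aligned
        if rng.random() < rate:
            masked = out[:i] + ["[MASK]"] + out[i + 1:]
            out[i] = fill(masked, i)
    return out, list(labels)


def constant_fill(masked: Sequence[str], position: int) -> str:
    """Stub standing in for a masked language model (assumption)."""
    return "also"


tokens = ["Anna", "works", "at", "Volvo", "in", "Gothenburg"]
labels = ["B-PER", "O", "O", "B-ORG", "O", "B-LOC"]
new_tokens, new_labels = augment_tokens(
    tokens, labels, constant_fill, rate=1.0, rng=random.Random(0)
)
```

With a real masked language model behind `fill`, the replacements would depend on context, which is where ungrammatical or semantically incongruent outputs can arise if candidates are not filtered.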
We have trained a named entity recognition (NER) model that screens Swedish job ads for different ki...
Data augmentation is widely used in text classification, especially in the low-resource regime where...
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learn...
Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to i...
Data augmentation is an effective approach to tackle over-fitting. Many previous works have proposed...
Pretrained multilingual language models have become a common tool in transferring NLP capabilities t...
This paper reports on the evaluation of Deep Learning (DL) transformer architecture models for Named...
In recent years, language models (LMs) have made remarkable progress in advancing the field of natu...
Pre-trained Language Models (PLMs) have been applied in NLP tasks and achieve promising results. Nev...
The objective of this thesis is to develop text augmentation approaches for Name Entity Recognition...
In many cases of machine learning, research suggests that the development of training data might hav...
Data augmentation, the artificial creation of training data for machine learning by transformations,...
Data augmentation methods are often used to address data scarcity in natural language processing (NL...
Linguistic annotation is time-consuming and expensive. One common annotation task is to mark entitie...
We present three large-scale experiments on binary text matching classification task both in Chinese...