Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to improve predictive performance. Synthetic data generation is common in numerous domains; in natural language processing (NLP), however, text augmentation has only recently emerged as a way to improve performance on downstream tasks. One of the current state-of-the-art text augmentation techniques is easy data augmentation (EDA), which augments the training data by injecting and replacing synonyms and randomly permuting sentences. One major obstacle with EDA is its need for versatile and complete synonym dictionaries, which are not easy to find for low-resource languages. To improve the utility of EDA, we propose two extensions, easy distributional data augmentation (EDDA)...
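The EDA operations named in the abstract above (replacing synonyms, injecting synonyms, random permutation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `SYNONYMS` dictionary is a hypothetical toy stand-in for a real synonym resource, the function names are invented, and permutation is implemented as swapping word positions within a sentence. It is not the authors' implementation, and it does not cover the EDDA extension.

```python
import random

# Hypothetical toy synonym dictionary (an assumption for this sketch).
# The abstract notes that complete dictionaries are hard to find for
# low-resource languages, so real coverage would be far sparser.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "happy": ["glad", "joyful"],
}


def synonym_replace(tokens, n, rng):
    """Replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, tok in enumerate(out) if tok in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def synonym_insert(tokens, n, rng):
    """Insert a synonym of a random known token at a random position, up to n times."""
    out = list(tokens)
    for _ in range(n):
        known = [tok for tok in out if tok in SYNONYMS]
        if not known:
            break  # nothing in the sentence has a dictionary entry
        synonym = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(tokens, n, rng):
    """Swap two distinct token positions, n times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the quick dog is happy".split()
    print(" ".join(synonym_replace(sentence, 1, rng)))
    print(" ".join(synonym_insert(sentence, 1, rng)))
    print(" ".join(random_swap(sentence, 1, rng)))
```

Each function returns a new token list and leaves its input untouched, so several augmented variants can be generated from one source sentence. Tokens without a dictionary entry are never replaced, which makes the dependence on dictionary coverage visible.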
This paper investigates very low resource language model pretraining, when less than 100 thousand se...
The field of low-density NLP is often approached from an engineering perspective, and evaluations ar...
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learn...
In low resource settings, data augmentation strategies are commonly leveraged to improve performance...
Data augmentation is widely used in text classification, especially in the low-resource regime where...
In many cases of machine learning, research suggests that the development of training data might hav...
Pretrained multilingual language models have become a common tool in transferring NLP capabilities t...
We study the effect of different approaches to text augmentation. To do this we use three datasets t...
Large Language Models (LLMs) have demonstrated impressive zero-shot performance on a wide range of N...
Data augmentation, the artificial creation of training data for machine learning by transformations,...
We tackle the problem of neural headline generation in a low-resource setting, where only limited am...
Recently there has been interest in the approaches for training speech recognition systems for langu...
Based on recent advances in natural language modeling and those in text generation capabilities, we ...
In the context of neural machine translation, data augmentation (DA) techniques may be used for gene...
Data Augmentation approaches often use Language Models, pretrained on large quantities of unlabeled ...