Data augmentation is widely used in text classification, especially in the low-resource regime where only a few examples per class are available during training. Despite this success, generating augmentations as hard positive examples, which may increase their effectiveness, remains under-explored. This paper proposes an Adversarial Word Dilution (AWD) method that generates hard positive examples as text data augmentations to efficiently train low-resource text classification models. Our idea for augmenting the text data is to dilute the embeddings of strong positive words by weighted mixing with the unknown-word embedding, making the augmented inputs hard to recognize as positive by the classification model. We adversarially learn the dilu...
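The dilution operation this abstract describes can be sketched as a convex mix of each token's embedding with the unknown-word embedding, controlled by a per-token weight. The sketch below is illustrative only: the function name, data layout, and fixed weights are assumptions, and in AWD the weights would be learned adversarially (to maximize classifier loss under a constraint) rather than hand-set as here.

```python
def dilute(word_embs, unk_emb, weights):
    """Mix each word embedding with the unknown-word embedding.

    word_embs: list of embedding vectors (one per token)
    unk_emb:   embedding vector of the unknown token
    weights:   per-token dilution weights in [0, 1]; a larger weight
               replaces more of the original word with UNK, making
               the augmented example a harder positive
    """
    diluted = []
    for emb, w in zip(word_embs, weights):
        w = min(max(w, 0.0), 1.0)  # keep the mix convex
        diluted.append([(1.0 - w) * x + w * u for x, u in zip(emb, unk_emb)])
    return diluted

# Toy usage: weight 0.0 keeps the word, 1.0 replaces it with UNK,
# 0.5 yields the midpoint of the two embeddings.
embs = [[1.0, 2.0], [4.0, 0.0]]
unk = [0.0, 0.0]
print(dilute(embs, unk, [0.5, 1.0]))  # [[0.5, 1.0], [0.0, 0.0]]
```

Because the diluted input stays on the segment between the original embedding and the UNK embedding, it remains a valid point in embedding space, which is what lets the classifier be trained on it directly.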
In this paper, we explore how to utilize pre-trained language models to perform few-shot text classif...
High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collec...
Large Language Models (LLMs) have demonstrated impressive zero-shot performance on a wide range of N...
Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to i...
In many cases of machine learning, research suggests that the development of training data might hav...
Data augmentation, the artificial creation of training data for machine learning by transformations,...
In low resource settings, data augmentation strategies are commonly leveraged to improve performance...
We study the effect of different approaches to text augmentation. To do this we use three datasets t...
We present three large-scale experiments on binary text matching classification task both in Chinese...
Data augmentation techniques are widely used for enhancing the performance of machine learning model...
Thanks to increases in computing power and the growing availability of large datasets, neural netwo...
In recent years, language models (LMs) have made remarkable progress in advancing the field of natu...
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learn...
Modern text classification models are susceptible to adversarial examples, perturbed versions of the...