Data augmentation is widely used in text classification, especially in the low-resource regime where only a few examples per class are available during training. Despite this success, generating augmented data as hard positive examples, which may increase their effectiveness, remains under-explored. This paper proposes an Adversarial Word Dilution (AWD) method that generates hard positive examples as text data augmentations to train low-resource text classification models efficiently. Our idea for augmenting the text data is to dilute the embeddings of strong positive words by mixing them, with learned weights, with the unknown-word embedding, making the augmented inputs hard for the classification model to recognize as positive. We adversarially learn the dilu...
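The dilution operation described above can be sketched as a convex combination of each word's embedding with the unknown-word embedding. This is a minimal illustration, not the paper's implementation: the embedding values, dimensionality, and the fixed dilution weights below are all hypothetical (in AWD the weights would be learned adversarially to maximize the classifier's loss).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 8-dimensional embeddings (hypothetical); in practice these would
# come from the classification model's embedding layer.
vocab = {"great": rng.normal(size=8), "movie": rng.normal(size=8)}
unk = rng.normal(size=8)  # unknown-word embedding

def dilute(words, alphas):
    """Mix each word embedding with the UNK embedding.

    alphas[i] in [0, 1] is the dilution weight for word i: 0 keeps the
    original embedding, 1 replaces it entirely with UNK. Here the weights
    are fixed for illustration; AWD learns them adversarially.
    """
    return [
        (1.0 - a) * vocab[w] + a * unk
        for w, a in zip(words, alphas)
    ]

# "great" (a strong positive word) is heavily diluted toward UNK,
# while "movie" is left nearly intact.
diluted = dilute(["great", "movie"], alphas=[0.7, 0.1])
assert np.allclose(diluted[0], 0.3 * vocab["great"] + 0.7 * unk)
```

The resulting diluted sequence is fed to the classifier as an augmented positive example: it retains the sentence's structure but weakens the strongest class-indicative words, making it harder to recognize as positive.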
Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to i...
Machine Learning is a sub-field of Artificial Intelligence that aims to automatically improve algori...
Despite their promising performance across various natural language processing (NLP) tasks, current ...
Based on recent advances in natural language modeling and those in text generation capabilities, we ...
We propose a simple and general method to regularize the fine-tuning of Transformer-based encoders f...
In many cases of machine learning, research suggests that the development of training data might hav...
Text classification is a basic task in natural language processing, but the small character perturba...
Text has traditionally been used to train automated classifiers for a multitude of purposes, such as...
Data Augmentation approaches often use Language Models, pretrained on large quantities of unlabeled ...
© Springer Nature Switzerland AG 2020. Recently, generating adversarial examples has become an impor...
Imbalanced data constitute an extensively studied problem in the field of machine learning classific...
We study an important and challenging task of attacking natural language processing models in a hard...
We study the effect of different approaches to text augmentation. To do this we use three datasets t...