Recently, generating adversarial examples has become an important means of measuring the robustness of a deep learning model. Adversarial examples help us identify a model's susceptibilities and counter those vulnerabilities through adversarial training. In the natural language domain, small perturbations in the form of misspellings or paraphrases can drastically change the semantics of the text. We propose a reinforcement-learning-based approach to generating adversarial examples in black-box settings. We demonstrate that our method is able to fool well-trained models on (a) the IMDB sentiment classification task and (b) the news categorization task on AG's news corpus with significant...
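As a rough illustration of how a black-box attack like this can be framed as a reinforcement-learning problem, here is a minimal Python sketch. It is a generic REINFORCE-style loop, not the paper's exact algorithm: the victim_confidence function is a hypothetical stand-in for query access to the target classifier, and synonyms is a placeholder for a real substitution source such as WordNet or embedding neighbors.

    import math
    import random

    def victim_confidence(tokens):
        # Hypothetical black-box victim: returns P(true class | text).
        # Toy stand-in: confidence falls as more words get perturbed.
        return max(0.0, 1.0 - 0.25 * sum(t.endswith("*") for t in tokens))

    def synonyms(word):
        # Placeholder; a real attack might use WordNet or embedding neighbors.
        return [word + "*"]

    def softmax(scores):
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def reinforce_attack(tokens, episodes=200, lr=0.5):
        # Policy: softmax over word positions; action: substitute a synonym
        # at the sampled position. Reward: drop in black-box confidence.
        theta = [0.0] * len(tokens)
        clean_conf = victim_confidence(tokens)
        best = list(tokens)
        for _ in range(episodes):
            probs = softmax(theta)
            pos = random.choices(range(len(tokens)), weights=probs)[0]
            candidate = list(best)
            candidate[pos] = random.choice(synonyms(candidate[pos]))
            reward = clean_conf - victim_confidence(candidate)
            # REINFORCE update: grad of log softmax is onehot(pos) - probs.
            for i in range(len(theta)):
                theta[i] += lr * reward * ((i == pos) - probs[i])
            if victim_confidence(candidate) < victim_confidence(best):
                best = candidate
            if victim_confidence(best) < 0.5:  # label has likely flipped
                break
        return best

    print(reinforce_attack("the movie was surprisingly good".split()))

The reward is simply the drop in the victim's confidence on the true class, so the policy learns which word positions are most worth perturbing without any gradient access to the model.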
Adversarial examples are inputs to a machine learning system that result in an incorrect output from...
In adversarial attacks intended to confound deep learning models, most studies have focused on limit...
Adversarial training is an approach to increasing the robustness of models to adversarial attacks by...
In recent years, neural networks have been widely used in image processing, natural language processin...
We study an important and challenging task of attacking natural language processing models in a hard...
Black-box attacks in deep reinforcement learning usually retrain substitute policies to mimic behavi...
The monumental achievements of deep learning (DL) systems seem to guarantee the absolute superiority...
Neural language models show vulnerability to adversarial examples that are semantically similar to ...
Research shows that natural language processing models are generally vulnerable to ...
Text classification is a basic task in natural language processing, but small character perturba...
Generating adversarial examples for natural language is hard, as natural language consists of discre...
In this thesis, we explore the prospects of creating adversarial examples using various generative mo...
Adversarial examples are helpful for analyzing and improving the robustness of text classifiers. Gen...
We study an important task of attacking natural language processing models in a black-box setting. W...
We propose a simple and general method to regularize the fine-tuning of Transformer-based encoders f...