Recent studies show that pre-trained language models (LMs) are vulnerable to textual adversarial attacks. However, existing attack methods either suffer from low attack success rates or fail to search efficiently in the exponentially large perturbation space. We propose SemAttack, an efficient and effective framework that generates natural adversarial text by constructing different semantic perturbation functions. In particular, SemAttack optimizes the generated perturbations constrained on generic semantic spaces, including the typo space, the knowledge space (e.g., WordNet), the contextualized semantic space (e.g., the embedding space of BERT clusterings), or a combination of these spaces. Thus, the generated adversarial texts are more semantically clo...
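As a rough illustration of the "knowledge space" mentioned in the abstract above, the sketch below builds a WordNet-based substitution candidate set with NLTK. This is not the SemAttack implementation; the function name is hypothetical and the example only shows one way such a perturbation space could be constructed.

```python
# Minimal, illustrative sketch: collect WordNet synonyms as a knowledge-space
# perturbation set for a single word. Requires: nltk.download('wordnet').
# Hypothetical helper name; not the authors' code.
from nltk.corpus import wordnet


def knowledge_space_candidates(word: str) -> set[str]:
    """Return single-token WordNet synonyms usable as substitution candidates."""
    candidates = set()
    for synset in wordnet.synsets(word):
        for lemma in synset.lemmas():
            name = lemma.name().replace("_", " ")
            # keep only single-word synonyms that differ from the input
            if " " not in name and name.lower() != word.lower():
                candidates.add(name)
    return candidates


# Example: candidate perturbations for one word
print(knowledge_space_candidates("movie"))  # e.g. {'film', 'picture', ...}
```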
Hard-label textual adversarial attack is a challenging task, as only the predicted label information...
Neural language models show vulnerability to adversarial examples which are semantically similar to ...
NLP researchers propose different word-substitute black-box attacks that can fool text classificatio...
Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alt...
Recent studies have shown that natural language processing (NLP) models are vulnerable to adversaria...
Adversarial attacks in NLP challenge the way we look at language models. The goal of this kind of ad...
We study an important and challenging task of attacking natural language processing models in a hard...
Semantic parsing is a technique aimed at constructing a structured representation of the meaning of ...
We study an important task of attacking natural language processing models in a black box setting. W...
The monumental achievements of deep learning (DL) systems seem to guarantee the absolute superiority...
Named Entity Recognition is a fundamental task in information extraction and is an essential element...
We attribute the vulnerability of natural language processing models to the fact that similar inputs...
Modern text classification models are susceptible to adversarial examples, perturbed versions of the...
Large language models (LLMs) are susceptible to red teaming attacks, which can induce LLMs to genera...
Despite their promising performance across various natural language processing (NLP) tasks, current ...