Pre-trained language models (PLMs) that achieve success in applications are susceptible to adversarial attack methods capable of generating adversarial examples with minor perturbations. Although recent attack methods can achieve a relatively high attack success rate (ASR), our observations show that the generated adversarial examples have a different data distribution compared with the original examples. Specifically, these adversarial examples exhibit lower confidence levels and a greater distance from the training data distribution. As a result, they are easy to detect using very simple detection methods, diminishing the actual effectiveness of these attack methods. To solve this problem, we propose a Distribution-Aware LoRA-based Ad...
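The observation above, that adversarial examples tend to arrive with unusually low model confidence, is what makes the "very simple detection methods" effective. Below is a minimal sketch of one such detector, assuming a maximum-softmax-probability threshold; the function names, the threshold value, and the toy logits are illustrative assumptions, not the detectors used in the work summarized here.

import torch
import torch.nn.functional as F

def max_softmax_confidence(logits: torch.Tensor) -> torch.Tensor:
    """Return the highest class probability for each example."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

def flag_low_confidence(logits: torch.Tensor, threshold: float = 0.7) -> torch.Tensor:
    """Flag examples whose top-class confidence falls below the threshold.

    Under the observation above, adversarial examples tend to receive lower
    confidence than clean inputs, so even a fixed threshold separates many of
    them. The threshold here is a hypothetical value; in practice it would be
    calibrated on held-out clean data.
    """
    return max_softmax_confidence(logits) < threshold

# Toy usage: two confident, "clean-looking" predictions and one diffuse,
# "adversarial-looking" prediction.
logits = torch.tensor([[4.0, 0.1, 0.2],   # confident
                       [0.3, 3.5, 0.1],   # confident
                       [0.9, 1.0, 1.1]])  # low confidence
print(flag_low_confidence(logits))  # tensor([False, False,  True])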
Recent work on adversarial attacks has shown that the Projected Gradient Descent (PGD) Adversary is a uni...
Adversarial attacks are a major concern in security-centered applications, where malicious actors co...
Deep neural networks are vulnerable to adversarial examples that are crafted by imposing imperceptib...
Despite the remarkable performance and generalization levels of deep learning models in a wide range...
Current machine learning models achieve super-human performance in many real-world applications. Sti...
The monumental achievements of deep learning (DL) systems seem to guarantee the absolute superiority...
Deep learning plays an important role in various disciplines, such as autonomous driving, information tech...
Recent work in black-box adversarial attacks for NLP systems has attracted much attention. Prior bla...
Adversarial attacks in NLP challenge the way we look at language models. The goal of this kind of ad...
Deep Neural Networks are susceptible to adversarial perturbations. Adversarial training and adversar...
Machine learning has become an important component of many systems and applications, including compu...
Adversarial training is the standard approach for training models that are robust to adversarial examples. However, e...
Recent studies have shown that natural language processing (NLP) models are vulnerable to adversaria...
Detecting adversarial examples currently stands as one of the biggest challenges in the field of dee...
Adversarial robustness continues to be a major challenge for deep learning. A core issue is that rob...