Large Pre-Trained Language Models (PLMs) have facilitated and dominated many NLP tasks in recent years. However, despite the great success of PLMs, there are also privacy concerns brought with PLMs. For example, recent studies show that PLMs memorize a lot of training data, including sensitive information, while the information may be leaked unintentionally and be utilized by malicious attackers. In this paper, we propose to measure whether PLMs are prone to leaking personal information. Specifically, we attempt to query PLMs for email addresses with contexts of the email address or prompts containing the owner's name. We find that PLMs do leak personal information due to memorization. However, the risk of specific personal information be...
Email is undoubtedly the most used communications mechanism in society today. Within business alone,...
Email is undoubtedly the most used communications mechanism in society today. Within business alone,...
Abstract: In this paper we describe an approach to information assurance in which we can prevent bre...
peer reviewedThe rapid advancement and widespread use of large language models (LLMs) have raised si...
peer reviewedThe rapid advancement and widespread use of large language models (LLMs) have raised si...
The rapid advancement and widespread use of large language models (LLMs) have raised significant con...
The wide adoption and application of Masked language models~(MLMs) on sensitive data (from legal to ...
With the wide availability of large pre-trained language models such as GPT-2 and BERT, the recent t...
Large language models are shown to present privacy risks through memorization of training data, and ...
International audienceWith the rise of machine learning and data-driven models especially in the fie...
Large multimodal language models have proven transformative in numerous applications. However, these...
International audienceWith the rise of machine learning and data-driven models especially in the fie...
Recent work has demonstrated the successful extraction of training data from generative language mod...
THIS IS PART OF THE ACCEPTED PAPER "Do Language Models Plagiarize?" @ Proceedings of the ACM Web Con...
Fine-tuning is a common and effective method for tailoring large language models (LLMs) to specializ...
Email is undoubtedly the most used communications mechanism in society today. Within business alone,...
Email is undoubtedly the most used communications mechanism in society today. Within business alone,...
Abstract: In this paper we describe an approach to information assurance in which we can prevent bre...
peer reviewedThe rapid advancement and widespread use of large language models (LLMs) have raised si...
peer reviewedThe rapid advancement and widespread use of large language models (LLMs) have raised si...
The rapid advancement and widespread use of large language models (LLMs) have raised significant con...
The wide adoption and application of Masked language models~(MLMs) on sensitive data (from legal to ...
With the wide availability of large pre-trained language models such as GPT-2 and BERT, the recent t...
Large language models are shown to present privacy risks through memorization of training data, and ...
International audienceWith the rise of machine learning and data-driven models especially in the fie...
Large multimodal language models have proven transformative in numerous applications. However, these...
International audienceWith the rise of machine learning and data-driven models especially in the fie...
Recent work has demonstrated the successful extraction of training data from generative language mod...
THIS IS PART OF THE ACCEPTED PAPER "Do Language Models Plagiarize?" @ Proceedings of the ACM Web Con...
Fine-tuning is a common and effective method for tailoring large language models (LLMs) to specializ...
Email is undoubtedly the most used communications mechanism in society today. Within business alone,...
Email is undoubtedly the most used communications mechanism in society today. Within business alone,...
Abstract: In this paper we describe an approach to information assurance in which we can prevent bre...