We present a novel benchmark and associated evaluation metrics for assessing the performance of text anonymization methods. Text anonymization, defined as the task of editing a text document to prevent the disclosure of personal information, currently suffers from a shortage of privacy-oriented annotated text resources, making it difficult to properly evaluate the level of privacy protection offered by various anonymization methods. This paper presents TAB (Text Anonymization Benchmark), a new, open-source annotated corpus developed to address this shortage. The corpus comprises 1,268 English-language court cases from the European Court of Human Rights (ECHR) enriched with comprehensive annotations about the personal information appearing i...
As a consequence of a recent curation project, the Dortmund Chat Corpus is available in CLARIN-D res...
In order to provide open access to data of public interest, it is often necessary to perform several...
This paper presents the results and analyses stemming from the first VoicePrivacy 2020 Challenge whi...
In the European Union, Data Controllers and Data Processors, who work with personal data, have to co...
Textual resources annotation is currently performed both manually by human experts selecting hand-cr...
The collection, publication, and mining of personal data have become key drivers of innovation and v...
The vast amount of data being collected about individuals has brought new challenges in protecting t...
Anonymity of both natural and legal persons in court rulings is a critical aspect of privacy protect...
Data sharing is a central aspect of judicial systems. The openly accessible documents can make the j...
Sharing data in the form of text is important for a wide range of activities but it also raises a co...
The EU General Data Protection R...
While vast amounts of personal data are shared daily on public online platforms and used by companie...
The VoicePrivacy initiative aims to promote the development of privacy preserv...
Most of the recent efforts addressing the issue of privacy have focused on devising algorithms for t...
The objective of this thesis is to make it easier to understand, use, and deploy strong anonymizatio...