Summary: Differential privacy allows quantifying the privacy loss that results from accessing sensitive personal data. Repeated accesses to the underlying data incur increasing loss. Releasing the data as privacy-preserving synthetic data would avoid this limitation, but leaves open the question of what kind of synthetic data to design. We propose formulating the problem of private data release through probabilistic modeling. This approach transforms the problem of designing the synthetic data into that of choosing a model for the data, and also allows the inclusion of prior knowledge, which improves the quality of the synthetic data. We demonstrate empirically, in an epidemiological study, that statistical discoveries can be reliably reproduced from the synt...
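The claim that repeated accesses incur increasing loss can be illustrated with a minimal sketch. It is not the method of the summarized paper. It assumes a counting query of sensitivity 1, the standard Laplace mechanism, and basic sequential composition, under which the privacy parameters of successive releases add up. The names `noisy_count` and `composed_epsilon`, the dataset, and the value of `eps` are hypothetical and chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with rate 1/scale
    # is Laplace-distributed with location 0 and the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, epsilon, rng):
    # Assumption: a counting query has sensitivity 1, so Laplace noise
    # with scale 1/epsilon yields an epsilon-differentially-private answer.
    return len(records) + laplace_noise(1.0 / epsilon, rng)


def composed_epsilon(per_query_epsilons):
    # Basic sequential composition: the losses of successive accesses add up.
    return sum(per_query_epsilons)


rng = random.Random(0)
records = list(range(1000))  # hypothetical stand-in for a sensitive dataset
eps = 0.5                    # illustrative per-query privacy parameter

# Four separate accesses to the underlying data, each costing eps.
answers = [noisy_count(records, eps, rng) for _ in range(4)]
total = composed_epsilon([eps] * 4)
print(total)  # 2.0
```

By contrast, a synthetic dataset released once under a fixed privacy budget can be analyzed any number of times at no further cost, because differential privacy is preserved under post-processing.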
There has been increasing interest in the problem of building accurate data mining models over aggre...
The availability of genomic data is essential to progress in biomedical research, personalized medi...
Modern business creates an increasing need for sharing, querying and mining information across auto...
Differential privacy allows quantifying privacy loss resulting from accession of sensitive personal ...
How can we share sensitive datasets in such a way as to maximize utility while simultaneously safegu...
With the growing concerns over data privacy and new regulations like the General Data Protection Reg...
Synthetic data has been advertised as a silver-bullet solution to privacy-preserving data publishing...
These talks were presented for the Privacy Day Webinar 2022 sponsored by the American Statistical As...
The predictive potential of the many large datasets being held in healthcare, financial markets, soc...
When releasing data for public use, statistical agencies seek to reduce the risk of disclosure, whil...
An individual's personal information is gathered by a multitude of different data collectors through...
Medical data often contain sensitive personal information about individuals, posing significant limi...
239 pages. In modern settings of data analysis, we may be running our algorithms on datasets that are ...
Methods for privacy-preserving data publishing and analysis trade off privacy risks for individuals ...
Many organizations collect data that would be useful to public researchers, but cannot be shared ...