We consider the problem of learning a discrete distribution in the presence of an epsilon fraction of malicious data sources. Specifically, we consider the setting where there is some underlying distribution, p, and each data source provides a batch of at least k samples, with the guarantee that at least a (1 - epsilon) fraction of the sources draw their samples from a distribution with total variation distance at most eta from p. We make no assumptions on the data provided by the remaining epsilon fraction of sources; this data can even be chosen as an adversarial function of the (1 - epsilon) fraction of "good" batches. We provide two algorithms: one with runtime exponential in the support size, n, but polynomial in k, 1/epsilon and 1/eta that ...
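The setting described in the abstract above can be illustrated with a small simulation. The following Python sketch is not either of the two algorithms the abstract mentions; it only generates batches under the stated model and measures how far a naive pooled estimate drifts from p. All function names, the mixing scheme used to keep good sources within total variation distance eta of p, and the particular adversary (every bad batch repeats a single symbol) are assumptions made for illustration.

```python
import random
from collections import Counter


def tv_distance(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def perturb(p, eta, rng):
    """Return q = (1 - eta) * p + eta * r for a random distribution r.

    Illustrative assumption: this mixing guarantees TV(p, q) <= eta.
    """
    raw = [rng.random() for _ in p]
    total = sum(raw)
    r = [x / total for x in raw]
    return [(1 - eta) * a + eta * b for a, b in zip(p, r)]


def make_batches(p, num_batches, k, epsilon, eta, rng):
    """Generate batches of k samples; an epsilon fraction is adversarial."""
    n = len(p)
    num_bad = int(epsilon * num_batches)
    batches = []
    for _ in range(num_batches - num_bad):
        q = perturb(p, eta, rng)  # good source: within eta of p
        batches.append(rng.choices(range(n), weights=q, k=k))
    for _ in range(num_bad):
        # Hypothetical adversary: put every sample on the last symbol.
        batches.append([n - 1] * k)
    return batches


def pooled_estimate(batches, n):
    """Naive baseline: empirical distribution of all samples pooled together."""
    counts = Counter(x for batch in batches for x in batch)
    total = sum(counts.values())
    return [counts[i] / total for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    p = [0.25, 0.25, 0.25, 0.25]
    batches = make_batches(p, num_batches=500, k=20, epsilon=0.2, eta=0.05, rng=rng)
    estimate = pooled_estimate(batches, len(p))
    print("TV error of pooled estimate:", tv_distance(p, estimate))
```

In this sketch the pooled estimate's error is on the order of epsilon regardless of the batch size k, because pooling ignores which batch each sample came from; exploiting the batch structure is what leaves room for a more careful algorithm to do better.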
We consider a setup in which confidential i.i.d. samples $X_1,\dotsc,X_n$ from an unknown finite-sup...
We investigate learning of classes of distributions over a discrete domain in a PAC context....
In information-hiding, an adversary that tries to infer the secret information...
© 2020 ACM. We study the problem, introduced by Qiao and Valiant, of learning from untrusted batches...
We study the problem of learning from unlabeled samples very general statistical mixture models on l...
Modern machine learning methods often require more data for training than a single expert can provid...
We consider the problem of learning mixtures of product distributions over discrete domains in the d...
We consider a general statistical learning problem where an unknown fraction of the training data is...
A large body of work shows that machine learning (ML) models can leak sensitive or confidential info...
We study the problem of testing discrete distributions with a focus on the high probability regime. ...
We give an algorithm for learning a mixture of unstructured distributions. This problem arises in va...