A variety of schemes have been proposed in the literature to speed up query processing and analytics by incrementally maintaining a bounded-size uniform sample from a dataset in the presence of a sequence of insertion, deletion, and update transactions. These algorithms vary according to whether the dataset is an ordinary set or a multiset and whether the transaction sequence consists only of insertions or can include deletions and updates. We report on subtle non-uniformity issues that we found in a number of these prior bounded-size sampling schemes, including some of our own. We provide workarounds that can avoid the non-uniformity problem; these workarounds are easy to implement and incur negligible additional cost. We also consider the...
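As a point of reference for the simplest setting named above (an insertion-only transaction sequence), the following is a minimal sketch of classic reservoir sampling, which maintains a uniform sample of at most k items. It is an illustration only and is not taken from the paper: the function name `reservoir_sample` and its parameters are assumptions, and the sketch deliberately leaves out deletions and updates, which is where the non-uniformity issues discussed above become relevant.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform sample of at most k items from an insertion-only stream.

    Hypothetical helper for illustration; handles insertions only.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    rng = rng or random.Random()
    sample: List[T] = []
    for n, item in enumerate(stream, start=1):
        if len(sample) < k:
            # The first k insertions fill the sample directly.
            sample.append(item)
        else:
            # The n-th item enters with probability k/n and, if it does,
            # evicts a slot chosen uniformly at random.
            j = rng.randrange(n)
            if j < k:
                sample[j] = item
    return sample
```

A call such as `reservoir_sample(range(1000), 10)` returns 10 distinct values from the stream. When the stream holds fewer than k items, the sample is simply the whole stream.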
Distribution testing is a crucial area at the interface of statistics and algorithms, where one wish...
In a recent paper [MRL98], we had described a general framework for single pass approximate quantile...
Consistent sampling is a technique for specifying, in small space, a subset S of a potentially large...
Perhaps the most flexible synopsis of a database is a uniform random sample of the data; such sample...
Perhaps the most flexible synopsis of a database is a random sample of the data; such samples are wi...
Perhaps the most flexible synopsis of a database is a uniform random sample of the data; such sample...
Random sampling has become a crucial component of modern data management systems. Although the liter...
Recent years have seen an unprecedented adoption of artificial intelligence in a wide variety of app...
The problem of uniform sampling is, given a formula F, sample solutions of F uniformly at random fro...
Uniform or near-uniform generation of solutions for large satisfiability formulas is a problem of th...
Consistent sampling is a technique for specifying, in small space, a subset S of a potenti...
This paper investigates the practice of non-uniformly sampling records from a database. The...
© 2018 Society for Industrial and Applied Mathematics. In many situations, sample data is obtained f...
In decision support applications, the ability to provide fast approximate answers to aggregation que...