We introduce a new class of algorithms to estimate the cardinality of very large multisets using constant memory and making only one pass over the data. It is based on order statistics rather than on bit patterns in binary representations of numbers. We analyse three families of estimators. They attain a standard error of $\frac{1}{\sqrt{M}}$ using $M$ units of storage, which places them in the same class as the best known algorithms so far. They have a very simple inner loop, which gives them an advantage in terms of processing speed. The algorithms are validated on internet traffic traces.
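The core idea of an order-statistics estimator can be illustrated with a minimal sketch: hash each element to a uniform value in $(0,1]$, keep only the $M$ smallest distinct hash values, and estimate the cardinality from the $M$-th smallest one as $(M-1)/x_{(M)}$. This is a generic minimum-based sketch under those assumptions, not the exact estimator families analysed in the paper; the function and parameter names are illustrative.

```python
import hashlib
import heapq

def estimate_cardinality(stream, m=1024):
    """Order-statistics (minimum-based) cardinality sketch.

    Keeps the m smallest distinct hash values seen. If x_(m) is the
    m-th smallest hash mapped into (0, 1], the number of distinct
    elements is estimated as (m - 1) / x_(m), with standard error
    on the order of 1/sqrt(m).
    """
    heap = []     # max-heap (values negated) of the m smallest hashes
    kept = set()  # mirrors heap contents, to skip duplicate hashes
    for item in stream:
        # Map the item to a uniform value in (0, 1] via a hash.
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        u = (h + 1) / 2**64
        if u in kept:
            continue
        if len(heap) < m:
            heapq.heappush(heap, -u)
            kept.add(u)
        elif u < -heap[0]:
            # Evict the current largest kept hash, admit the smaller one.
            kept.discard(-heapq.heappushpop(heap, -u))
            kept.add(u)
    if len(heap) < m:
        return float(len(heap))  # fewer than m distinct values: count is exact
    return (m - 1) / (-heap[0])  # -heap[0] is x_(m)
```

Because the sketch only tracks a bounded set of minima, the inner loop is a single hash plus a comparison against the current threshold, which is what makes this family of estimators fast in practice.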
This article is a companion to an invited talk at ICDT'2022 with the same title. Cardinality esti...
Adaptive sampling [a1] is a probabilistic algorithm invented by M. Wegman (unpublished) around 1980....
Cardinality estimation is an important component of query optimization. Its accuracy and efficiency ...
This article considers the problem of cardinality estimation in data stream applications. We present...
This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, d...
Giroire has recently proposed an algorithm which returns the $\textit{approximate}$ number of distin...
Counting in general, and estimating the cardinality of (multi-) sets in particular, is highly desira...
This text is an informal review of several randomized algorithms that have appeared over t...
This book presents several compact and fast methods for online traffic measurement of big network da...
Many sketches based on estimator sharing have been proposed to estimate cardinality with hu...
Statistics computation over data streams is often required by many applications, including processin...