We introduce a new class of algorithms to estimate the cardinality of very large multisets using constant memory and making only one pass over the data. It is based on order statistics rather than on bit patterns in binary representations of numbers. We analyse three families of estimators. They attain a standard error of $\frac{1}{\sqrt{M}}$ using $M$ units of storage, which places them in the same class as the best known algorithms so far. They have a very simple inner loop, which gives them an advantage in terms of processing speed. The algorithms are validated on internet traffic traces.
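The core idea of an order-statistics estimator can be illustrated with a minimal sketch: hash each element to a uniform value in $(0,1]$, keep only the $M$ smallest distinct hash values, and estimate the cardinality from the $M$-th smallest one as $(M-1)/x_{(M)}$. This is a generic minimum-based sketch under those assumptions, not the exact estimator families analysed in the paper; the function and parameter names are illustrative.

```python
import hashlib
import heapq

def estimate_cardinality(stream, m=1024):
    """Order-statistics (minimum-based) cardinality sketch.

    Keeps the m smallest distinct hash values seen. If x_(m) is the
    m-th smallest hash mapped into (0, 1], the number of distinct
    elements is estimated as (m - 1) / x_(m), with standard error
    on the order of 1/sqrt(m).
    """
    heap = []     # max-heap (values negated) of the m smallest hashes
    kept = set()  # mirrors heap contents, to skip duplicate hashes
    for item in stream:
        # Map the item to a uniform value in (0, 1] via a hash.
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        u = (h + 1) / 2**64
        if u in kept:
            continue
        if len(heap) < m:
            heapq.heappush(heap, -u)
            kept.add(u)
        elif u < -heap[0]:
            # Evict the current largest kept hash, admit the smaller one.
            kept.discard(-heapq.heappushpop(heap, -u))
            kept.add(u)
    if len(heap) < m:
        return float(len(heap))  # fewer than m distinct values: count is exact
    return (m - 1) / (-heap[0])  # -heap[0] is x_(m)
```

Because the sketch only tracks a bounded set of minima, the inner loop is a single hash plus a comparison against the current threshold, which is what makes this family of estimators fast in practice.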
This article is a companion to an invited talk at ICDT'2022 with the same title. Cardinality esti...
Adaptive sampling [a1] is a probabilistic algorithm invented by M. Wegman (unpublished) around 1980....
Cardinality estimation is an important component of query optimization. Its accuracy and efficiency ...
This article considers the problem of cardinality estimation in data stream applications. We present...
This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, d...
Giroire has recently proposed an algorithm which returns the $\textit{approximate}$ number of distin...
Counting in general, and estimating the cardinality of (multi-) sets in particular, is highly desira...
This text is an informal review of several randomized algorithms that have appeared over t...
This book presents several compact and fast methods for online traffic measurement of big network da...
Many sketches based on estimator sharing have been proposed to estimate cardinality with hu...
Statistics computation over data streams is often required by many applications, including processin...