This paper develops a new mathematical-statistical approach to analyzing a class of Flajolet-Martin algorithms (FMa), and provides analytical confidence intervals, based on Chernoff bounds, for the number F0 of distinct elements in a stream. FMa have become very popular in big-data stream learning, and the literature has focused mainly on algorithmic aspects, chiefly complexity optimality, while the statistical analysis of this class of algorithms has often been handled heuristically. The analysis provided here shows deep connections with mathematical special functions and with extreme value theory. The latter connection may help in explaining heuristic considerations, while the former opens many num...
We consider the problem of approximating the frequency of frequently occurring elements in a stream ...
Estimating ranks, quantiles, and distributions over streaming data is a central task in data analysi...
In this paper, we provide the first optimal algorithm for the remaining open question from the semin...
We present two new algorithms for the range-efficient F0 estimating problem and improve the ...
We investigate the problem of estimating on the fly the frequency at which ite...
We consider data streams of transactions that are generated independently with some non-stationary d...
In insertion-only streaming, one sees a sequence of indices a_1, a_2, ..., a_m in [n]. The stream de...
This thesis examines two types of problems: that of analyzing large quantities of real data, and the ...
This electronic version was submitted by the student author. The certified thesis is available in th...
Estimating the cardinality (i.e. number of distinct elements) of an arbitrary set expression defined o...
In this dissertation, we make progress on certain algorithmic problems broadly over two computationa...
We study the problem of estimating the largest gain of an unknown linear and time-invariant filter, ...
Maintaining frequency counts for data streams has attracted much interest among the research communi...
In the adversarially robust streaming model, a stream of elements is presented to an algorithm and i...
The exact computation of the number of distinct elements (frequency moment F0) is a fundamental prob...