We present lower bounds on the space required to estimate the quantiles of a stream of numerical values. Quantile estimation is perhaps the most studied problem in the data stream model, and it is relatively well understood in the basic single-pass data stream model, in which the values are ordered adversarially. Natural extensions of this basic model include the random-order model, in which the values are ordered randomly (e.g. [21,5,13,11,12]), and the multi-pass model, in which an algorithm is permitted a limited number of passes over the stream (e.g. [6,7,1,19,2]). We present lower bounds that complement existing upper bounds [21,11] in both models. One consequence is an exponential separation between the random-order and adversaria...
The need to estimate a particular quantile of a distribution is an important problem that frequently...
Streaming algorithms, which process very large datasets received one update at a time, are a key too...
Quantiles are important statistics used to describe the distribution of datasets. G...
When trying to process a data stream in small space, how important is the order in which the data ar...
When trying to process a data stream in small space, how important is the order in which the data ar...
We study the communication complexity of evaluating functions when the input data is randomly alloca...
Recently, there has been an increased focus on modeling uncertainty by distributions. Suppose we wis...
High-volume data streams are too large and grow too quickly to store entirely in working memory, int...
Estimating ranks, quantiles, and distributions over streaming data is a central task in data analysi...
A fundamental problem in data management and analysis is to generate descriptions of the distributio...
A fundamental problem in data management and analysis is to generate descriptions of the distributi...
This LNCS vol. is the Proceedings of FAW 2010. This paper studies the space complexity of the ε-approx...
In a recent paper [MRL98], we had described a general framework for single pass approximate quantile...
We present new algorithms for computing approximate quantiles of large datasets in a single pass. Th...
The need to estimate a particular quantile of a distribution is an important problem which ...