In data warehousing applications, numerous OLAP queries involve the processing of holistic operations such as computing the "top N", the median, etc. Efficient implementations of these operations are hard to come by. Several algorithms have been proposed in the literature that estimate various quantiles of disk-resident data. Two such recent algorithms are based on sampling. In this paper we present two novel and efficient quantiling algorithms, Deterministic Bucketing (DB) and Randomized Bucketing (RB). We have analyzed the performance of DB and RB and extended the analysis of the sampling done in prior algorithms. We have conducted extensive experiments to compare all four algorithms. Our experimental data indicate that our ne...
We analyze the storage/accuracy trade-off of an adaptive sampling algorithm due to Wegman that ma...
The estimation of the quantiles is pertinent when one is mining data streams. However, the complexit...
Representing a continuous random variable by a finite number of values is known as quantization. Giv...
The φ-quantile of an ordered sequence of data values is the element with rank φ × n, ...
We present new algorithms for computing approximate quantiles of large datasets in a single pass. Th...
In a recent paper [MRL98], we described a general framework for single pass approximate quantile...
In data warehousing applications, numerous OLAP queries involve the processing of holistic aggregato...
A fundamental problem in data management and analysis is to generate descriptions of the distributi...
A fundamental problem in data management and analysis is to generate descriptions of the distributio...
We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimens...
Quantiles are important statistics used to describe the distribution of datasets. G...
Order statistics, i.e., quantiles, are frequently used in databases both at the database se...
The traditional estimator ξ̂_{p,n} for the p-quantile ξ_p of a random variable X, given n observations f...