Abstract. Space-efficient algorithms play a central role in dealing with large amounts of data. In such settings, one would like to analyse the data using a small amount of “working space”. A key step in many algorithms for analysing large data is to maintain a random sample (or a small number of random samples) of the data points. In this paper, we consider two space-restricted settings: (i) the streaming model, where data arrives over time and one can use only a small amount of storage, and (ii) the query model, where we can structure the data in low space and answer sampling queries. We prove the following results in these two settings: – In the streaming setting, we would like to maintain a random sample from the elements seen so f...
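For orientation, the streaming task described above (keeping a uniform random sample of the elements seen so far in small space) has a well-known baseline, classic reservoir sampling. The Python sketch below illustrates that baseline only; it is not the algorithm of this paper. The function name `reservoir_sample` and its parameters are hypothetical, chosen for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return a uniform random sample of up to k items from a stream.

    Classic reservoir sampling (often called Algorithm R), shown as a
    baseline. It stores at most k items regardless of stream length.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a uniformly chosen slot with
            # probability k / (i + 1); randint is inclusive on both ends.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 items from a stream of 10,000 integers in one pass.
    print(reservoir_sample(range(10_000), 5))
```

After processing n items, each item is in the reservoir with probability k/n, and the working space is O(k) items regardless of the stream length.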
Consistent sampling is a technique for specifying, in small space, a subset S of a potentially large...
We introduce an alternative to reservoir sampling, a classic and popular algorithm for drawing a fix...
In this dissertation, we make progress on certain algorithmic problems broadly over two computationa...
This paper studies the independent range sampling problem. The input is a set P of n points in R. G...
We address the trade-off between the computational resources needed to process a large data set and ...
Exact solutions are unattainable for important problems. The calculations are limited by the memory ...
Random sampling is an appealing approach to build synopses of large data streams because random samp...
The last decade witnessed extensive study of algorithms for data streams. In this model, the i...
Abstract. We consider estimation of arbitrary range partitioning of data values and ranking of frequ...
We revisit the range sampling problem: the input is a set of points where each point is associated w...
We present multiple pass streaming algorithms for a basic clustering problem for massive data sets. ...
Abstract. We introduce the problem of sampling from a moving window of recent items from a data strea...
In this thesis, we give efficient algorithms and near-tight lower bounds for the following problems ...
In a recent paper [MRL98], we described a general framework for single-pass approximate quantile...
Abstract. Consistent sampling is a technique for specifying, in small space, a subset S of a potenti...