In this dissertation, we make progress on certain algorithmic problems broadly over two computational models: the streaming model for large datasets and the distribution testing model for large probability distributions. First, we consider the streaming model, where a large sequence of data items arrives one by one. The computer needs to make one pass over this sequence, processing every item quickly and in limited space. In Chapter 2, motivated by a bioinformatics application, we consider the problem of estimating the number of low-frequency items in a stream, which has received only limited theoretical attention so far. We give an efficient streaming algorithm for this problem and show that its complexity is almost optimal. In Chapter 3 we consider ...
We give an improved algorithm for drawing a random sample from a large data stream when the input el...
Technological progress has encouraged the study of various high-dimensional systems through the lens...
We investigate the problem of estimating on the fly the frequency at which ite...
Computing functions over a distributed stream of data is a significant problem with practical applic...
Data streams have emerged as a natural computational model for numerous applications of big data pro...
Exact solutions are unattainable for important problems. The calculations are limited by the memory ...
Distribution testing is a crucial area at the interface of statistics and algorithms, where one wish...
This electronic version was submitted by the student author. The certified thesis is available in th...
The past decade has witnessed many interesting algorithms for maintaining statistics over a data str...
We consider weighted random sampling from distributed data streams presented as a sequence of mini-b...
The field of streaming algorithms has enjoyed a deal of focus from the theoretical computer science ...