In this thesis, we study key questions that touch upon many important data-intensive problems in practice. How can we process this influx of data using a small amount of memory without sacrificing solution quality? We study this question in the context of the classical k-means clustering problem in the streaming model under a data separability assumption. We design a near-optimal streaming approximation algorithm that uses small space and makes one pass over the stream. The streaming model may be too restrictive for certain problems that demand more computational resources. Can we still provide provable guarantees for such applications, where the input arrives online? We consider this question in the context of load bal...
We consider the problem of resource allocation in mining multiple data streams. Due to the large vol...
In many applications, the data is of rich structure that can be represented by a hypergraph, where t...
Abstract. Data stream clustering has wide applications, such as online financial transactions, telep...
Exact solutions are unattainable for important problems. The calculations are limited by the memory ...
We study clustering under the data stream model of computation where: given a sequence of points, th...
We present multiple pass streaming algorithms for a basic clustering problem for massive data sets. ...
Streaming data analysis has recently attracted attention in numerous applications including telepho...
This thesis studies clustering problems on data streams, specifically with applications to metric sp...
Data streams are usually generated in an online fashion characterized by huge volume, rapid unpredic...
This electronic version was submitted by the student author. The certified thesis is available in th...
Machine learning algorithms are now being deployed in practically all areas of our lives. Part of th...
The massive growth of modern datasets from different sources such as videos, social networks, and se...
In many engineering and machine learning applications, we often encounter optimization problems (e.g...
Mean Shift is a well-known clustering algorithm that has attractive properties such as the ability t...
In this paper, we consider sparse networks consisting of a finite number of non-overlapping communit...