With the emergence of massive datasets across different application domains, there is a rapidly growing need to solve various optimization tasks over such datasets. This in turn raises the following fundamental question: How well can we solve a large-scale optimization problem on massive datasets in a resource-efficient manner? The focus of this thesis is on answering this question for various problems in different modern computational models for processing massive datasets, in particular, streaming, distributed, and massively parallel computation (such as MapReduce) models. The first part of this thesis is focused on graph optimization, in which we study several fundamental graph optimization problems including matching, vertex cover, and ...
Very recently at SODA'15 [2], we studied maximal matching via the framework of parameterized streami...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Analyzing massive data sets and streams is computationally very challenging. Data sets in systems bi...
Many modern services need to routinely perform tasks on a large scale. This prompts us to consider t...
Although computing power has advanced at an astonishing rate, it has been far outpaced by the growin...
In contrast to the traditional random access memory computational model where the entire input is av...
The recent explosion in size and complexity of datasets and the increased availability of computatio...
Includes bibliographical references (leaves 28-31). Current generation supercomputers have thousands ...
As the scale of the problems we want to solve in real life becomes larger, the input sizes of the pr...
Massive graphs arise in many scenarios, for example, traffic data analysis in large networks, larg...
We study the classic NP-Hard problem of finding the maximum k-set coverage in the data stream model:...
In this thesis, we study the power and limit of algorithms on various models, aiming at applications...
Multi-pass streaming algorithms for Maximum Matching have been studied for more than 15 years and v...