Streaming applications frequently encounter skewed workloads and execute on heterogeneous clusters. Optimal resource utilization in such adverse conditions becomes a challenge, as it requires inferring the resource capacities and input distribution at run time. In this paper, we tackle these challenges by modeling them as a load balancing problem. We propose a novel partitioning strategy called Consistent Grouping (CG), which enables each processing element instance (PEI) to process the workload according to its capacity. The main idea behind CG is the notion of small, equal-sized “virtual workers” at the sources, which are assigned to physical workers based on their capacities. We provide a theoretical analysis of the propos...
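The virtual-worker idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes capacities are known up front (CG infers them at run time), and the names assign_virtual_workers and route, the largest-remainder rounding, and the MD5-based key hashing are all hypothetical choices made for illustration only.

```python
import hashlib


def assign_virtual_workers(capacities, num_virtual):
    """Split num_virtual equal-sized virtual workers among physical workers
    in proportion to their capacities (largest-remainder rounding).

    capacities: dict mapping a physical worker name to a positive capacity.
    Returns a list where index i holds the physical owner of virtual worker i.
    Assumption: capacities are given; the real system must estimate them.
    """
    total = sum(capacities.values())
    quotas = {w: num_virtual * c / total for w, c in capacities.items()}
    counts = {w: int(q) for w, q in quotas.items()}
    leftover = num_virtual - sum(counts.values())
    # Give the remaining virtual workers to the largest fractional quotas.
    by_remainder = sorted(
        capacities, key=lambda w: quotas[w] - counts[w], reverse=True
    )
    for w in by_remainder[:leftover]:
        counts[w] += 1
    owner = []
    for w in capacities:
        owner.extend([w] * counts[w])
    return owner


def route(key, owner):
    """Hash a message key to a virtual worker, then return its physical owner."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    virtual_id = int.from_bytes(digest[:8], "big") % len(owner)
    return owner[virtual_id]


if __name__ == "__main__":
    owner = assign_virtual_workers({"slow": 1, "fast": 3}, num_virtual=8)
    print(owner.count("slow"), owner.count("fast"))
    print(route("user-42", owner))
```

Because the virtual workers are small and equal-sized, rebalancing in a scheme like this amounts to reassigning entries of the owner list from an overloaded physical worker to an underloaded one, rather than recomputing the key-to-worker mapping.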
In order to process very large graphs, existing graph processing systems, such as Pregel and Giraph,...
In a multicluster architecture, where jobs can be submitted through each constituent cluster, the jo...
We study the problem of load balancing in distributed stream processing engines, which is exacerbate...
Key grouping is a technique used by stream processing frameworks to simplify the development of para...
With ever increasing data volumes, large compute clusters that process data in a distributed manner ...
With the increasing availability of graph data and widely adopted cloud computing paradigm, graph pa...
The goal of load balancing is to assign to each node a number of tasks proportional to its...
Stream Processing has become a major programming model to timely handle large volumes of data genera...
As many-core accelerators keep integrating more processing units, it becomes increasingly more diffi...
Scalability in stream processing systems can be achieved by using a cluster of computing devices. Th...