To process very large graphs, existing graph processing systems, such as Pregel and Giraph, usually partition the graph and distribute the computation over a large number of nodes (i.e., workers). However, because computing clusters are heterogeneous (e.g., nodes differ in bandwidth or CPU resources), blindly increasing the number of workers for a job may even degrade overall performance. In this paper, we address the question of how to distribute graph computation over a heterogeneous cluster to maximize performance. Based on the practical constraints of current systems, we address this problem in two scenarios. For systems using a hash-based partitioning method (which avoids the overhead of indexing and searching vertices), we pro...
Allocating tasks to machines in computing clusters is described. In an embodiment a set of tasks ass...
Applications structured as parallel task graphs exhibit both data and task par...
We investigate the problem of scheduling real-time applications in cluster computing environments. T...
With the increasing availability of graph data and widely adopted cloud computing paradigm, graph pa...
With ever increasing data volumes, large compute clusters that process data in a distributed manner ...
In order to improve system performance efficiently, a number of systems choose to equip mul...
The amount of data generated every day is growing exponentially in the big data era. A significant p...
Extracting knowledge by performing computations on graphs is becoming increasingly challenging as gr...
Graph processing is increasingly used in a variety of domains, from engineering to logistics and fro...
Graph partitioning is considered to be a standard solution to process huge graphs efficiently when p...
As the study of large graphs over hundreds of gigabytes becomes increasingly popular for various dat...
Among scheduling algorithms for scientific workflows, graph partitioning is a technique...
Partitioning an input graph over a set of workers is a complex operation. Objectives are tw...