We present the most recent release of our parallel implementation of the Breadth-First Search (BFS) and Betweenness Centrality (BC) algorithms for the study of large-scale graphs. Although our reference platform is a high-end cluster of new-generation NVIDIA GPUs and some of our optimisations are CUDA-specific, most of our ideas can be applied to other platforms offering multiple levels of parallelism. We exploit multi-level parallel processing through a hybrid programming paradigm that combines highly tuned CUDA kernels, for the computations performed by each node, with explicit data exchange through the Message Passing Interface (MPI), for the communications among nodes. The results of the numerical experiments show that the performance of our code is comparable or better with respect t...
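The two-level structure described above (local computation on each node, then explicit exchange among nodes) can be illustrated with a minimal, CPU-only sketch of a level-synchronous BFS. This is not the authors' code: the function names, the modulo-based 1D vertex partitioning, and the in-memory "outbox" that stands in for an MPI all-to-all exchange are all assumptions made for illustration, and the inner loop only plays the role that a CUDA kernel would play in a real implementation.

```python
def owner(v, num_ranks):
    """Hypothetical 1D partitioning: vertex v is owned by rank v % num_ranks."""
    return v % num_ranks


def distributed_bfs(adj, source, num_ranks):
    """Level-synchronous BFS over a graph split across simulated ranks.

    adj maps each vertex to a list of neighbours. Returns a dict that maps
    every vertex reachable from source to its BFS level.
    """
    # Each simulated rank keeps levels only for the vertices it owns.
    level = [dict() for _ in range(num_ranks)]
    frontier = [[] for _ in range(num_ranks)]
    src_rank = owner(source, num_ranks)
    level[src_rank][source] = 0
    frontier[src_rank].append(source)

    depth = 0
    while any(frontier):
        # Step 1: local expansion. In a hybrid code this is the per-node
        # kernel; here it is a plain loop over the local frontier.
        outbox = [[[] for _ in range(num_ranks)] for _ in range(num_ranks)]
        for rank in range(num_ranks):
            for u in frontier[rank]:
                for v in adj.get(u, []):
                    outbox[rank][owner(v, num_ranks)].append(v)

        # Step 2: exchange. Each rank receives the candidates it owns from
        # every other rank (a stand-in for an all-to-all message exchange).
        depth += 1
        next_frontier = [[] for _ in range(num_ranks)]
        for dst in range(num_ranks):
            for src in range(num_ranks):
                for v in outbox[src][dst]:
                    # Only the owner decides whether v is newly discovered.
                    if v not in level[dst]:
                        level[dst][v] = depth
                        next_frontier[dst].append(v)
        frontier = next_frontier

    # Gather the per-rank results into a single dictionary.
    merged = {}
    for part in level:
        merged.update(part)
    return merged
```

Calling `distributed_bfs(adj, 0, num_ranks)` with different values of `num_ranks` should return the same levels, since the partitioning only changes where each vertex is stored, not the traversal order by level.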
Recent advances in the design of efficient parallel algorithms have been largely focusing on the now...
Fast, scalable, low-cost, and low-power execution of parallel graph algorithms is important...
We investigate multi-level parallelism on GPU clusters with MPI-CUDA and hybrid MPI-OpenMP-CUDA para...
Modern Graphics Processing Units (GPUs) provide high computation power at low costs and have been de...
One of the main activities within the Group for Scientific Computing at the Faculty of Science is m...
It seems natural to use GPUs (Graphics Processing Units) for performing analytics on big graphs...
When working on graphs, reachability is among the most common problems to address, since it is the b...
Graphs that model social networks, numerical simulations, and the structure of the Interne...
Parallel graph algorithms have become one of the principal applications of high-performance computin...
In many practical applications, including image processing, space searching, network analysi...
There has been significant recent interest in parallel graph processing due to the need to quickly a...
Large graphs involving millions of vertices are common in many practical applications and...
Breadth-first search (BFS) is one of the most common graph traversal algorithms and the building blo...
openaire: EC/H2020/818665/EU//UniSDyn Funding Information: This work was supported by the Academy ...
We consider sequential algorithms for hypergraph partitioning and GPU (i.e., fine-grained shared-mem...