The scaling behavior of different OpenFOAM versions is analyzed on two benchmark problems. Results show that the applications scale reasonably well up to a thousand tasks. In-depth profiling identifies the calls to the MPI_Allreduce function in the linear algebra core libraries as the main communication bottleneck. Sub-optimal on-core performance is attributed to the sparse-matrix storage format, which does not currently employ any cache-blocking mechanism. Possible strategies to overcome these limitations are proposed and analyzed, and preliminary results on prototype implementations are presented.
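To illustrate why a global reduction such as MPI_Allreduce can limit scaling in linear algebra kernels, the following minimal sketch simulates a distributed dot product in plain Python. It is not OpenFOAM code and does not use MPI: the "ranks" are simply list partitions, and the names `split_among_ranks`, `simulated_allreduce_sum` and `distributed_dot` are hypothetical helpers introduced here for illustration. The point is that the local products need no communication, while the final sum requires every rank to wait for every other rank.

```python
from typing import List, Sequence


def split_among_ranks(values: Sequence[float], n_ranks: int) -> List[List[float]]:
    """Partition a vector into contiguous chunks, one per simulated rank."""
    chunk = (len(values) + n_ranks - 1) // n_ranks  # ceiling division
    return [list(values[r * chunk:(r + 1) * chunk]) for r in range(n_ranks)]


def simulated_allreduce_sum(partials: Sequence[float]) -> List[float]:
    """Stand-in for a sum all-reduce: every rank receives the global total.

    In a real MPI program this is the synchronization point, because no rank
    can proceed until all partial sums have been combined.
    """
    total = sum(partials)
    return [total for _ in partials]


def distributed_dot(x: Sequence[float], y: Sequence[float], n_ranks: int) -> List[float]:
    """Dot product of x and y computed as local partial sums plus one reduction."""
    x_parts = split_among_ranks(x, n_ranks)
    y_parts = split_among_ranks(y, n_ranks)
    # Local work: each simulated rank multiplies and sums only its own chunk.
    partials = [
        sum(a * b for a, b in zip(xp, yp))
        for xp, yp in zip(x_parts, y_parts)
    ]
    # Global step: one all-reduce per dot product.
    return simulated_allreduce_sum(partials)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [1.0, 1.0, 1.0, 1.0, 1.0]
    print(distributed_dot(x, y, n_ranks=2))
```

Iterative solvers typically evaluate several such dot products or norms per iteration, so the number of global reductions grows with the iteration count while the local work per rank shrinks as more tasks are added.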
We present a study of the architectural requirements and scalability of the NAS Parallel Benchmarks....
Programmability and performance-per-watt are the major challenges of the race to Exascale. In this s...
Parallelizing sparse irregular application on distributed memory systems poses serious scalability c...
The Forschungszentrum Juelich is utilizing OpenFOAM to perform calculations for electrochemical devi...
The most widely used node type in high-performance computing nowadays is a 2-socket server node. The...
As the dawn of the exascale era arrives, high-performance computing (HPC) researchers continue to se...
OpenFOAM, an open source industrial Computational Fluid Dynamics (CFD) tool, which contains dozens o...
The performance results from the hybridization of the OpenFOAM linear system solver, tested on the C...
Solution of large sparse linear systems is frequently the most time consuming operation in computati...
While the growing number of cores per chip allows researchers to solve larger scientific and enginee...
Over the last few decades, Message Passing Interface (MPI) has become the parallel-communication sta...
We discuss efficient shared memory parallelization of sparse matrix computatio...
Dense linear algebra libraries need to cope efficiently with a range of input problem sizes and shap...
Multiphase flow solvers are widely-used applications in OpenFOAM, whose scalability suffers from the...