Abstract. The increasing number of processing elements and decreasing memory to core ratio in modern high-performance platforms makes efficient strong scaling a key requirement for numerical algorithms. In order to achieve efficient scalability on massively parallel systems scientific software must evolve across the entire stack to exploit the multiple levels of parallelism exposed in modern architectures. In this paper we demonstrate the use of hybrid MPI/OpenMP parallelisation to optimise parallel sparse matrix-vector multiplication in PETSc, a widely used scientific library for the scalable solution of partial differential equations. Using large matrices generated by Fluidity, an open source CFD application code which uses PETSc as i...
This paper describes the implementation of a hybrid OpenMP/MPI parallelization strategy in a Disco...
Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: share...
This work is devoted to the development of efficient parallel algorithms for the direct numerical si...
The trend towards highly parallel multi-processing is ubiquitous in all modern computer architecture...
Solution of large sparse linear systems is frequently the most time consuming operation in computati...
We present our work on developing a hybrid parallel programming model for a general finite element s...
The Portable, Extensible Toolkit for Scientific Computation (PETSc) library package is a popular co...
Massively parallel devices of various architectures are being adopted by the newest supercomputers t...
This manual describes the use of PETSc for the numerical solution of partia...
This paper has been submitted to the CFD 2014 conference. To leverage the last two decades' transit...
This manual describes the use of PETSc 2.0 for the numerical solution of partial differential equati...
A hybrid scheme that utilizes MPI for distributed memory parallelism and OpenMP for shared memory pa...