The trend towards highly parallel multi-processing is ubiquitous across modern computer architectures, ranging from handheld devices to large-scale HPC systems; yet many applications struggle to fully utilise the multiple levels of parallelism exposed by modern high-performance platforms. To realise the full potential of recent hardware advances, a mixed-mode approach that combines shared-memory programming techniques with inter-node message passing can be adopted, providing high levels of parallelism with minimal overheads. For scientific applications this means that not only the simulation code itself but the whole software stack needs to evolve. In this paper, we evaluate the mixed-mode performance of PETSc, a widely used scientifi...
We present our work on developing a hybrid parallel programming model for a general finite element s...
The Sparse Matrix-Vector Multiplication is the key operation in many iterative methods. Th...
With a large variety and complexity of existing HPC machines and uncertainty regarding exact future ...
The increasing number of processing elements and decreasing memory to core ratio in modern high-perf...
Large applications for parallel computers and more specifically unstructured C...
In this paper, we highlight our progress in implementing a hybrid OpenMP-MPI version of the unstruct...
Two paradigms for distributed-memory parallel computation that free the application programmer from ...
The Portable, Extensible Toolkit for Scientific Computation (PETSc) library package is a popular co...
Massively-parallel devices of various architectures are being adopted by the newest supercomputers t...
The mixing of shared memory and message passing programming models within a single application has o...
The complexity of the latest HPC architectures increasingly limits the productivity of researchers i...
This dissertation studies the sources of poor performance in scientific computing codes based on par...
Many/multi-core supercomputers provide a natural programming paradigm for hybrid MPI/OpenMP scientif...